Abstract. In this paper we develop the large deviations principle and a rigorous mathematical
framework for asymptotically efficient importance sampling schemes for general, fully dependent
systems of stochastic differential equations of slow and fast motion with small noise in the slow
component. We assume periodicity with respect to the fast component. Depending on the interaction of
the fast scale with the smallness of the noise, we get different behavior. We examine how one range
of interaction differs from the other, both for the large deviations and for the importance
sampling. We use the large deviations results to identify asymptotically optimal importance sampling
schemes in each case. Standard Monte Carlo schemes perform poorly in the small noise limit. In
the presence of multiscale aspects one faces additional difficulties, and straightforward adaptation of
importance sampling schemes for standard small noise diffusions will not produce efficient schemes.
It turns out that one has to consider the so-called cell problem from the homogenization theory for
Hamilton-Jacobi-Bellman equations in order to guarantee asymptotic optimality. We use stochastic
control arguments.
1. Introduction

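Let us consider the $m + (d - m)$ dimensional process $(X^{\epsilon}, Y^{\epsilon}) = \{(X^{\epsilon}(s), Y^{\epsilon}(s)),\ 0 \le s \le T\}$
satisfying the system of stochastic differential equations (SDEs)

$$
\begin{aligned}
dX^{\epsilon}(s) &= \left[\frac{\epsilon}{\delta}\, b\left(X^{\epsilon}(s), Y^{\epsilon}(s)\right) + c\left(X^{\epsilon}(s), Y^{\epsilon}(s)\right)\right] ds + \sqrt{\epsilon}\, \sigma\left(X^{\epsilon}(s), Y^{\epsilon}(s)\right) dW(s), \\
dY^{\epsilon}(s) &= \frac{1}{\delta}\left[\frac{\epsilon}{\delta}\, f\left(X^{\epsilon}(s), Y^{\epsilon}(s)\right) + g\left(X^{\epsilon}(s), Y^{\epsilon}(s)\right)\right] ds + \frac{\sqrt{\epsilon}}{\delta}\left[\tau_1\left(X^{\epsilon}(s), Y^{\epsilon}(s)\right) dW(s) + \tau_2\left(X^{\epsilon}(s), Y^{\epsilon}(s)\right) dB(s)\right], \\
X^{\epsilon}(0) &= x_0, \qquad Y^{\epsilon}(0) = y_0,
\end{aligned}
\tag{1.1}
$$

where $\delta = \delta(\epsilon) \downarrow 0$ as $\epsilon \downarrow 0$ and $(W(s), B(s))$ is a $2\kappa$-dimensional standard Wiener process. The
functions $b(x,y)$, $c(x,y)$, $\sigma(x,y)$, $f(x,y)$, $g(x,y)$, $\tau_1(x,y)$ and $\tau_2(x,y)$ are assumed to be sufficiently
smooth (see Condition 2.1) and periodic with period $\lambda$ in every direction with respect to the second
variable.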
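To make the scaling in (1.1) concrete, the following is a minimal Euler-Maruyama sketch for a one-dimensional slow and one-dimensional fast component. The coefficient functions, the step count and the function name `simulate_slow_fast` are hypothetical choices made only for illustration (they are smooth, bounded and periodic in $y$, but they are not taken from the paper and are not claimed to satisfy the centering condition used later for Regime 1).

```python
import math
import random


def simulate_slow_fast(eps, delta, T=1.0, n_steps=20000,
                       x0=0.0, y0=0.0, lam=1.0, seed=0):
    """Euler-Maruyama path of a 1+1 dimensional toy instance of (1.1)."""
    rng = random.Random(seed)
    two_pi = 2.0 * math.pi / lam

    # Hypothetical coefficients: smooth, bounded, lam-periodic in y.
    def b(x, y): return math.sin(two_pi * y)
    def c(x, y): return -math.tanh(x)
    def sigma(x, y): return 1.0
    def f(x, y): return math.cos(two_pi * y)
    def g(x, y): return 0.5 * math.sin(two_pi * y)
    def tau1(x, y): return 0.5
    def tau2(x, y): return 1.0

    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    sqrt_eps = math.sqrt(eps)
    x, y = x0, y0
    xs, ys = [x], [y]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, sqrt_dt)
        dB = rng.gauss(0.0, sqrt_dt)
        # Slow component: drift (eps/delta) b + c, noise sqrt(eps) sigma dW.
        x_new = (x + ((eps / delta) * b(x, y) + c(x, y)) * dt
                 + sqrt_eps * sigma(x, y) * dW)
        # Fast component: drift (1/delta)[(eps/delta) f + g],
        # noise (sqrt(eps)/delta)[tau1 dW + tau2 dB].
        y_new = (y + ((eps / delta) * f(x, y) + g(x, y)) * dt / delta
                 + (sqrt_eps / delta) * (tau1(x, y) * dW + tau2(x, y) * dB))
        x, y = x_new, y_new % lam  # keep the fast variable on the torus
        xs.append(x)
        ys.append(y)
    return xs, ys
```

The time step has to resolve the fast scale, so in practice `n_steps` should grow as `delta` shrinks; the default above is only a placeholder.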

One can interpret the system (1.1) as a system of slow and fast motion with $X^{\epsilon}$ playing the role
of the slow motion and $Y^{\epsilon}$ playing the role of the fast motion. The goal of this paper is to provide a
large deviations analysis of (1.1) that allows us to rigorously develop the importance sampling theory
for estimation of functionals such as

$$\theta(\epsilon) = \mathbb{E}\left[e^{-\frac{1}{\epsilon} h(X^{\epsilon}(T))} \,\middle|\, X^{\epsilon}(0) = x_0,\ Y^{\epsilon}(0) = y_0\right]. \tag{1.2}$$

Importance sampling is a variance reduction technique in Monte Carlo simulation. As is well
known, standard Monte Carlo sampling techniques perform very poorly in that the relative errors
under a fixed computational effort grow rapidly as the event becomes more and more rare.
Estimating rare event probabilities in the context of slow-fast systems presents extra difficulties due to
the underlying fast motion and its interaction with the intensity of the noise $\epsilon$.
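The growth of the relative error can be quantified in the simplest possible setting. The sketch below assumes the quantity of interest is a plain probability $p$ estimated by the sample mean of $N$ independent indicators (a simplification of the functional (1.2)); the function names are illustrative only.

```python
import math


def relative_error(p, n_samples):
    """Relative standard deviation of the sample mean of n_samples
    independent Bernoulli(p) indicators: sqrt((1 - p) / (n_samples * p))."""
    return math.sqrt((1.0 - p) / (n_samples * p))


def samples_needed(p, target_rel_error):
    """Smallest sample size for which the relative error is at most the target."""
    return math.ceil((1.0 - p) / (p * target_rel_error ** 2))
```

For a fixed sample size the relative error behaves like $1/\sqrt{Np}$, so when $p$ decays exponentially in $1/\epsilon$ the sample size needed for a fixed accuracy grows exponentially as well.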

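Depending on the order in which $\epsilon$ and $\delta$ go to zero, we have three different regimes of interaction:

$$
\lim_{\epsilon \downarrow 0} \frac{\epsilon}{\delta} =
\begin{cases}
\infty & \text{Regime 1,} \\
\gamma \in (0, \infty) & \text{Regime 2,} \\
0 & \text{Regime 3.}
\end{cases}
\tag{1.3}
$$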



If $\delta$ goes to zero faster than $\epsilon$ (Regime 1) then homogenization occurs first, whereas if $\epsilon$ goes
to zero faster than $\delta$ (Regime 3) then large deviations theory tells how quickly (1.1) converges to
the averaged deterministic ODE given by setting $\epsilon$ equal to zero. If the two parameters go to zero
together then one has an intermediate situation (Regime 2).
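As a concrete illustration of (1.3), consider the hypothetical power-law family $\delta(\epsilon) = \epsilon^{p}$ with $p > 0$, which is an assumption made here for the example and not a restriction in the paper. Then $\epsilon/\delta = \epsilon^{1-p}$, and the regime is determined by $p$ alone.

```python
def scale_ratio(eps, p):
    """eps / delta for the illustrative choice delta = eps**p."""
    return eps / eps ** p


def regime_for_power_law(p):
    """Regime in (1.3) when delta = eps**p with p > 0."""
    if p <= 0:
        raise ValueError("p must be positive so that delta -> 0 as eps -> 0")
    if p > 1:
        return 1  # eps/delta = eps**(1 - p) -> infinity
    if p == 1:
        return 2  # eps/delta = 1, so gamma = 1
    return 3      # eps/delta = eps**(1 - p) -> 0
```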

The study of rare events in the multiscale context is a difficult problem due to the presence of 
the underlying fast motion. The first necessary step is to develop the associated large deviations 
theory. Using weak convergence arguments the authors in [H] prove the large deviations principle 
for the special case $f = b$, $g = c$, $\tau_1 = \sigma$ and $\tau_2 = 0$. We extend the results of [H] to the current
general setup. Then, using the large deviation results and stochastic control arguments we construct 
asymptotically optimal importance sampling schemes with rigorous bounds on performance. The 
construction is based on subsolutions for an associated Hamilton-Jacobi-Bellman (HJB) equation 
as in [m [18]. The situation here is complicated due to the presence of the fast motion. It turns 
out that changes of measure that are implied by the homogenized system do not lead to efficient 
importance sampling schemes. The standard arguments have to be modified taking into account the 
solution to the related "cell problem" which is different for each regime. This is also tightly related 
to the homogenization theory for HJB equations. A control in full feedback form, i.e., a function 
of both the slow variable $X^{\epsilon}$ and the fast variable $Y^{\epsilon}$, is used to construct dynamic importance
sampling schemes with precise asymptotic performance bounds. The control involves both the 
solution to the appropriate homogenized HJB equation and to its corresponding cell problem. 

The novelty of this work lies in developing (a) the large deviations principle and (b) a general 
and rigorous mathematical framework for the study of importance sampling schemes for systems 
of slow-fast motion as in (1.1) for all three regimes of interaction (1.3). Multiscale stochastic
control problems and related large deviations problems have been studied elsewhere as well under 
various assumptions and dependencies of the coefficients of the system on the slow and fast motion, 
see [a El [II |22l [Ml [281 [23 IM [Ml EZl [38]. The papers [22l [231 [SH [371 [38] address the large 
deviations principle for Regime 2 for special cases of dependence of the coefficients on (x, y). With 
the exception of [14, 29], they express it through a Legendre-Fenchel transform of the limit of the
normalized logarithm of an exponential moment or of the first eigenvalue of an associated operator. 
Here we provide an explicit characterization of the action functional. Also, the large deviations 
arguments in the aforementioned papers do not cover the full nonlinear case that we study here and 
do not seem to provide insights into how to construct asymptotically efficient importance sampling 
schemes. Some related importance sampling results on this problem have been recently obtained 
in [15]. There the authors study the special case of $f = b$, $g = c$, $\tau_1 = \sigma$ and $\tau_2 = 0$ for Regime 1
only and provide simulation studies for that particular case as well. It is also demonstrated there
that straightforward adaptation of importance sampling schemes for standard diffusion processes
(without the multiscale aspect) will have poor results in the multiscale setting. This means
that one needs to consider the solution to the cell problem in problems with multiple scales in
order to guarantee good asymptotic performance. The treatment of the general case, which is the
content of the current paper, requires additional considerations. In particular, the identification of 
the optimal control and of the associated subsolutions and cell problems are more involved here 
even for Regime 1. The case of Regimes 2 and 3 is studied in this paper for the first time. This 
work is closely related to the homogenization theory of HJB equations, e.g., [H [2l O [201 113 l3T],
see Section 5.

We note here that one can relax the periodicity assumption both for the large deviations and for 
the importance sampling. In particular, in the case of Regime 1 using the results and methodology 
of \34:\ |T3] and of the present paper, we can prove an analogous result when the fast variable takes
values in $\mathbb{R}^{d-m}$ instead of the torus. In the case of Regime 2, the extension to the whole space with
full dependence of the coefficients on $(x, y)$ is more involved. However, it seems plausible that the
methods of the current paper can be combined with those of [26^ [5l [E] to weaken the periodicity 
assumption for Regime 2 as well. This will be addressed elsewhere. 

The need to simulate rare events occurs in many application areas including telecommunication, 
finance, insurance and chemistry. We present some examples in Section 6. A model of interest
in chemical physics and chemistry is the first order Langevin equation in a rough potential, e.g.
[3Ql [Ml ESI [H [39]. This is a special case of the system (1.1) with $f = b = -\nabla Q(y)$, $g = c = -\nabla V(x)$,
$\tau_1 = \sigma = $ constant and $\tau_2 = 0$, and is discussed in Subsection 6.1. Another example,
discussed in Subsection 6.2, is related to short time asymptotics of a process that depends on
another fast mean reverting process. 

The rest of the paper is organized as follows. In Section [2] we introduce necessary notation 
and our assumptions. Section [3] is devoted to the related large deviations theory. In Section [4] we 
develop the importance sampling theory for all three possible regimes of interaction that guarantees 
asymptotic optimality. In Section [5] we discuss the connection of the importance sampling theory 
with the homogenization of HJB equations. We conclude with Section [6] where we examine what
our results look like in some special cases of interest.



2. Notation and assumptions 

In this section we establish some notation and lay out our main assumptions. Let us assume a
filtered probability space $(\Omega, \mathfrak{F}, \mathbb{P})$ equipped with a filtration $\mathfrak{F}_t$ that satisfies the usual conditions,
namely, $\mathfrak{F}_t$ is right continuous and $\mathfrak{F}_0$ contains all $\mathbb{P}$-negligible sets.

The main assumption for the coefficients of (1.1) is as follows.

Condition 2.1. (i) The functions $b(x,y)$, $c(x,y)$, $\sigma(x,y)$, $f(x,y)$, $g(x,y)$, $\tau_1(x,y)$ and $\tau_2(x,y)$
are bounded in both variables and periodic with period $\lambda$ in the second variable in each
direction. We additionally assume that they are C^(M.'^~"^) in y and C^(M"') in x with all 
partial derivatives continuous and globally bounded in x and y. 
(ii) The diffusion matrices $\sigma\sigma^{T}$ and $\tau_1\tau_1^{T} + \tau_2\tau_2^{T}$ are uniformly nondegenerate.

Under Condition 2.1, the system (1.1) has a unique strong solution. The smoothness assumptions
are stronger than necessary, but they guarantee smoothness and boundedness of the associated cell
problems that will appear in the development of the importance sampling theory. For notational
convenience we define the operator $\cdot : \cdot$, where for two matrices $A = [a_{ij}]$, $B = [b_{ij}]$

$$A : B = \sum_{i,j} a_{ij} b_{ij}.$$



Let $\mathcal{Y} = \mathbb{T}^{d-m}$ be the $(d - m)$-dimensional torus. This is the state space of the fast motion. For
the purposes of consistency with the related literature we use similar notation as in [14, 15] with
the appropriate modifications in order to cover the more general set-up that we treat here.

Under Regime 1, we also impose the following condition. 

Condition 2.2. Let $F \in C^2(\mathcal{Y}; \mathbb{R})$ and consider the operator

$$\mathcal{L}^1_x F(y) = f(x, y) \cdot \nabla_y F(y) + \frac{1}{2}\left(\tau_1\tau_1^{T} + \tau_2\tau_2^{T}\right)(x, y) : \nabla_y \nabla_y F(y)$$

equipped with periodic boundary conditions in $y$. Under Regime 1, we assume the centering condition
(see m)

$$\int_{\mathcal{Y}} b(x, y)\, \mu(dy|x) = 0,$$

where $\mu(dy|x)$ is the unique invariant measure corresponding to the operator $\mathcal{L}^1_x$.

Under Conditions 2.1 and 2.2, Theorem 3.3.4 in [6] guarantees that for each $\ell \in \{1, \ldots, m\}$ there
is a unique function $\chi_\ell(x, y)$, twice differentiable with all partial derivatives up to second order bounded
and $\lambda$-periodic in each direction in $y$, that satisfies the cell problem:

$$\mathcal{L}^1_x \chi_\ell(x, y) = -b_\ell(x, y), \qquad \int_{\mathcal{Y}} \chi_\ell(x, y)\, \mu(dy|x) = 0. \tag{2.1}$$

We write $\chi = (\chi_1, \ldots, \chi_m)$.
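For intuition, the cell problem (2.1) can be approximated numerically in one dimension with $x$ frozen. The following is a rough finite-difference sketch under several assumptions that are not part of the paper: a scalar fast variable, central differences on a uniform periodic grid, and least-squares solves to impose the normalization of the invariant weights and the zero-mean condition on $\chi$. The function name and arguments are hypothetical.

```python
import numpy as np


def solve_cell_problem_1d(f, tau_sq, b, n=200, lam=1.0):
    """Finite-difference sketch of L chi = -b on the torus [0, lam),
    with L F = f F' + 0.5 * tau_sq F'' and x held fixed.
    f, tau_sq and b map a grid array y to arrays of the same shape."""
    h = lam / n
    y = np.arange(n) * h
    fv, av, bv = f(y), 0.5 * tau_sq(y), b(y)

    # Discrete generator with periodic central differences.
    A = np.zeros((n, n))
    for k in range(n):
        kp, km = (k + 1) % n, (k - 1) % n
        A[k, kp] += fv[k] / (2 * h) + av[k] / h ** 2
        A[k, km] += -fv[k] / (2 * h) + av[k] / h ** 2
        A[k, k] += -2 * av[k] / h ** 2

    # Invariant weights mu: A^T mu = 0 together with sum(mu) = 1.
    M = np.vstack([A.T, np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    mu = np.linalg.lstsq(M, rhs, rcond=None)[0]

    # Cell problem: A chi = -b together with the normalization mu . chi = 0.
    M2 = np.vstack([A, mu[None, :]])
    rhs2 = np.concatenate([-bv, [0.0]])
    chi = np.linalg.lstsq(M2, rhs2, rcond=None)[0]
    return y, mu, chi
```

A quick sanity check is the case $f \equiv 0$, $\tau_1^2 + \tau_2^2 \equiv 1$, $b(y) = \sin(2\pi y)$ with $\lambda = 1$, where the invariant measure is uniform and the exact solution is $\chi(y) = \sin(2\pi y)/(2\pi^2)$. If the centering condition fails for the chosen $b$, the least-squares solve still returns an array, but it no longer solves the cell problem.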

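Let us denote $\mathcal{Z} = \mathbb{R}^{\kappa}$. This will be the space in which the control processes that will appear in
the next sections take values.

Definition 2.3. For $(x, y, z_1, z_2) \in \mathbb{R}^m \times \mathcal{Y} \times \mathcal{Z} \times \mathcal{Z}$ and for Regime $i = 1, 2, 3$ defined in (1.3) we
define the operators $\mathcal{L}^i_{z_1,z_2,x}$. For $i = 1, 2$ we let $\mathcal{D}(\mathcal{L}^i_{z_1,z_2,x}) = C^2(\mathcal{Y})$ and for $i = 3$, $\mathcal{D}(\mathcal{L}^3_{z_1,z_2,x}) = C^1(\mathcal{Y})$.
For $F \in \mathcal{D}(\mathcal{L}^i_{z_1,z_2,x})$ define

$$
\begin{aligned}
\mathcal{L}^1_{z_1,z_2,x} F(y) &= f(x,y) \cdot \nabla_y F(y) + \frac{1}{2}\left(\tau_1\tau_1^{T} + \tau_2\tau_2^{T}\right)(x,y) : \nabla_y\nabla_y F(y), \\
\mathcal{L}^2_{z_1,z_2,x} F(y) &= \left[\gamma f(x,y) + g(x,y) + \tau_1(x,y) z_1 + \tau_2(x,y) z_2\right] \cdot \nabla_y F(y) + \gamma \frac{1}{2}\left(\tau_1\tau_1^{T} + \tau_2\tau_2^{T}\right)(x,y) : \nabla_y\nabla_y F(y), \\
\mathcal{L}^3_{z_1,z_2,x} F(y) &= \left[g(x,y) + \tau_1(x,y) z_1 + \tau_2(x,y) z_2\right] \cdot \nabla_y F(y).
\end{aligned}
$$

Definition 2.4. For $(x, y, z_1, z_2) \in \mathbb{R}^m \times \mathcal{Y} \times \mathcal{Z} \times \mathcal{Z}$ and for Regime $i = 1, 2, 3$ defined in (1.3) we
define the functions $\lambda_i(x, y, z_1, z_2) : \mathbb{R}^m \times \mathcal{Y} \times \mathcal{Z} \times \mathcal{Z} \to \mathbb{R}^m$ by

$$
\begin{aligned}
\lambda_1(x,y,z_1,z_2) &= c(x,y) + \frac{\partial \chi}{\partial y}(x,y)\, g(x,y) + \sigma(x,y) z_1 + \frac{\partial \chi}{\partial y}(x,y)\left(\tau_1(x,y) z_1 + \tau_2(x,y) z_2\right), \\
\lambda_2(x,y,z_1,z_2) &= \gamma b(x,y) + c(x,y) + \sigma(x,y) z_1, \\
\lambda_3(x,y,z_1,z_2) &= c(x,y) + \sigma(x,y) z_1,
\end{aligned}
$$

where $\chi = (\chi_1, \ldots, \chi_m)$ is defined by (2.1).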

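For a Polish space $S$, let $\mathcal{P}(S)$ be the space of probability measures on $S$. Next we recall the
notion of viability as defined in [14].

Definition 2.5. A pair $(\psi, \mathrm{P}) \in C([0,T]; \mathbb{R}^m) \times \mathcal{P}(\mathcal{Z} \times \mathcal{Z} \times \mathcal{Y} \times [0,T])$ will be called viable with
respect to $(\lambda, \mathcal{L})$, and we write $(\psi, \mathrm{P}) \in \mathcal{V}_{(\lambda, \mathcal{L})}$, if the following hold:

• The function $\psi_t$ is absolutely continuous.

• The measure $\mathrm{P}$ is square integrable in the sense that $\int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}\times[0,T]} \left[\|z_1\|^2 + \|z_2\|^2\right] \mathrm{P}(dz_1 dz_2 dy ds) < \infty$.

• For all $t \in [0, T]$

$$\psi_t = x_0 + \int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}\times[0,t]} \lambda(\psi_s, y, z_1, z_2)\, \mathrm{P}(dz_1 dz_2 dy ds). \tag{2.2}$$

• For all $t \in [0, T]$ and for every $F \in \mathcal{D}(\mathcal{L})$

$$\int_0^t \int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}} \mathcal{L}_{z_1,z_2,\psi_s} F(y)\, \mathrm{P}(dz_1 dz_2 dy ds) = 0. \tag{2.3}$$

• For all $t \in [0, T]$

$$\mathrm{P}(\mathcal{Z} \times \mathcal{Z} \times \mathcal{Y} \times [0,t]) = t. \tag{2.4}$$

Notice that equation (2.4) implies that the last marginal of $\mathrm{P}$ is Lebesgue measure, and hence $\mathrm{P}$
can be decomposed in the form $\mathrm{P}(dz_1 dz_2 dy dt) = \mathrm{P}_t(dz_1 dz_2 dy)\, dt$.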



3. Large deviations principle 

The authors in [T3] establish the large deviations principle related to (1.1) in the special case
of $f = b$, $g = c$, $\tau_1 = \sigma$ and $\tau_2 = 0$. We extend the results of [H] to the current general setup.
A uniform approach to the large deviations problem for (1.1) is presented, which allows us to
treat all three regimes with essentially the same general strategy, even though the technical details might be
different from regime to regime. Moreover, in the course of the proof of the large deviations lower
bound, we need to construct a nearly optimal control that attains the large deviations bound. As
we will see in Section 4, this control can guide the construction of efficient importance sampling
for the estimation of quantities such as (1.2).

Essentially, in each regime, the action functional is given by the infimization of a quadratic
functional, where the infimum is determined by the averaging of an appropriate controlled version
of the limiting slow motion with respect to the corresponding fast motion. Both the limiting slow
motion and the fast motion with respect to which the averaging is being done differ from regime to
regime. This is related to the notion of viability from Definition 2.5, where the viable pairs $(\lambda, \mathcal{L})$ are
obtained from Definitions 2.3 and 2.4 for each regime. What differs from the special case considered
in [14] is the form of the appropriate viable pair in each case. We present this characterization
below.

In preparation for stating the main large deviations results, we recall the concept of a Laplace 
principle. 

Definition 3.1. Let $\{X^{\epsilon}, \epsilon > 0\}$ be a family of random variables taking values in a Polish space $S$
and let $I$ be a rate function on $S$. We say that $\{X^{\epsilon}, \epsilon > 0\}$ satisfies the Laplace principle with rate
function $I$ if for every bounded and continuous function $h : S \to \mathbb{R}$

$$\lim_{\epsilon \downarrow 0} -\epsilon \ln \mathbb{E}\left[\exp\left\{-\frac{h(X^{\epsilon})}{\epsilon}\right\}\right] = \inf_{x \in S}\left[I(x) + h(x)\right]. \tag{3.1}$$

If the level sets of the rate function (equivalently action functional) are compact, then the Laplace
principle is equivalent to the corresponding large deviations principle with the same rate function
(Theorems 2.2.1 and 2.2.3 in [T3]).
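The Laplace principle can be checked numerically in a toy case that is much simpler than the multiscale system: $X^{\epsilon} \sim N(0, \epsilon)$ on the real line, for which the rate function is $I(x) = x^2/2$. The sketch below is an illustration under these assumptions only; the function names, the integration window and the test function $h$ are hypothetical choices. It evaluates the prelimit quantity in (3.1) by a Riemann sum with a log-sum-exp shift.

```python
import math


def laplace_functional(eps, h, lo=-6.0, hi=6.0, n=24001):
    """-eps * ln E[exp(-h(X)/eps)] for X ~ N(0, eps), by a Riemann sum."""
    dx = (hi - lo) / (n - 1)
    exponents = []
    for k in range(n):
        x = lo + k * dx
        # exponent of integrand: -(h(x) + I(x)) / eps with I(x) = x^2 / 2
        exponents.append(-(h(x) + 0.5 * x * x) / eps)
    shift = max(exponents)  # log-sum-exp shift to avoid underflow
    log_integral = shift + math.log(sum(math.exp(e - shift) for e in exponents) * dx)
    log_normalizer = 0.5 * math.log(2.0 * math.pi * eps)
    return -eps * (log_integral - log_normalizer)


def h_example(x):
    """Bounded continuous test function; inf over x of I(x) + h(x) equals 1/3."""
    return min((x - 1.0) ** 2, 4.0)
```

For this $h$ the infimum of $x^2/2 + h(x)$ is attained at $x = 2/3$ with value $1/3$, and `laplace_functional(eps, h_example)` approaches that value as `eps` decreases.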

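The derivation of the large deviations and importance sampling results is based on a variational
representation for functionals of a Wiener process derived in [8], which allows us to rewrite the prelimit
left hand side of (3.1). Let $Z(\cdot)$ be a standard $n$-dimensional Wiener process and $F(\cdot)$ a bounded
and measurable real-valued function defined on the set of $\mathbb{R}^n$-valued continuous functions on $[0, T]$.
By Theorem 3.1 in [8] we have

$$-\log \mathbb{E}\left[\exp\{-F(Z(\cdot))\}\right] = \inf_{u \in \mathcal{A}} \mathbb{E}\left[\frac{1}{2}\int_0^T \|u(s)\|^2 ds + F\left(Z(\cdot) + \int_0^{\cdot} u(s) ds\right)\right], \tag{3.2}$$

where $\mathcal{A}$ is the set of all $\mathfrak{F}_s$-progressively measurable $n$-dimensional processes $u = \{u(s), 0 \le s \le T\}$
satisfying

$$\mathbb{E}\int_0^T \|u(s)\|^2 ds < \infty.$$

In the present case, let $Z(\cdot) = (W(\cdot), B(\cdot))$ and $n = 2\kappa$. Under Condition 2.1, the system (1.1)
has a unique strong solution. Therefore $X^{\epsilon}$ and $Y^{\epsilon}$ are measurable functions of $Z(\cdot) = (W(\cdot), B(\cdot))$.
After setting $F(Z(\cdot)) = h(X^{\epsilon}(\cdot))/\epsilon$ and rescaling the controls by $1/\sqrt{\epsilon}$ we get the representation

$$-\epsilon \ln \mathbb{E}_{x_0,y_0}\left[\exp\left\{-\frac{h(X^{\epsilon})}{\epsilon}\right\}\right] = \inf_{u \in \mathcal{A}} \mathbb{E}_{x_0,y_0}\left[\frac{1}{2}\int_0^T \left[\|u_1(s)\|^2 + \|u_2(s)\|^2\right] ds + h(\bar{X}^{\epsilon})\right], \tag{3.3}$$

where the controlled pair $(\bar{X}^{\epsilon}, \bar{Y}^{\epsilon})$ is the unique strong solution to

$$
\begin{aligned}
d\bar{X}^{\epsilon}(s) &= \left[\frac{\epsilon}{\delta}\, b\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) + c\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) + \sigma\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) u_1(s)\right] ds + \sqrt{\epsilon}\, \sigma\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) dW(s), \\
d\bar{Y}^{\epsilon}(s) &= \frac{1}{\delta}\left[\frac{\epsilon}{\delta}\, f\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) + g\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) + \tau_1\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) u_1(s) + \tau_2\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) u_2(s)\right] ds \\
&\quad + \frac{\sqrt{\epsilon}}{\delta}\left[\tau_1\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) dW(s) + \tau_2\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) dB(s)\right], \\
\bar{X}^{\epsilon}(0) &= x_0, \qquad \bar{Y}^{\epsilon}(0) = y_0.
\end{aligned}
\tag{3.4}
$$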



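Therefore in order to derive the Laplace principle for $\{X^{\epsilon}\}$, it is enough to study the limit of
the right hand side of the variational representation (3.3). The first step in doing so is to consider
the weak limit of the slow motion $\bar{X}^{\epsilon}$ of the controlled couple (3.4). Due to the involved controls,
it is convenient to introduce the following occupation measure. Let $\Delta = \Delta(\epsilon) \downarrow 0$ as $\epsilon \downarrow 0$. The
role of $\Delta(\epsilon)$ is to exploit a time-scale separation. Let $A_1, A_2, B, \Gamma$ be Borel sets of $\mathcal{Z}, \mathcal{Z}, \mathcal{Y}, [0,T]$
respectively. Let $u^{\epsilon} = (u_1^{\epsilon}, u_2^{\epsilon}) \in \mathcal{A}$ and let $(\bar{X}^{\epsilon}, \bar{Y}^{\epsilon})$ solve (3.4) with $u_i^{\epsilon}$ in place of $u_i$, $i = 1, 2$. We associate
with $(\bar{X}^{\epsilon}, \bar{Y}^{\epsilon})$ and $\Delta$ a family of occupation measures $\mathrm{P}^{\epsilon,\Delta}$ defined by

$$\mathrm{P}^{\epsilon,\Delta}(A_1 \times A_2 \times B \times \Gamma) = \int_{\Gamma}\left[\frac{1}{\Delta}\int_t^{t+\Delta} 1_{A_1}(u_1^{\epsilon}(s))\, 1_{A_2}(u_2^{\epsilon}(s))\, 1_{B}\left(\bar{Y}^{\epsilon}(s) \bmod \lambda\right) ds\right] dt.$$

We assume that $u_i^{\epsilon}(s) = 0$ for $i = 1, 2$ if $s > T$.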

Theorem 3.2 deals with the limiting behavior of the controlled process (3.4) under each of the
three regimes, and uses the notion of a viable pair.

Theorem 3.2. Assume Condition 2.1 and under Regime 1 assume Condition 2.2. Fix the initial
point $(x_0, y_0) \in \mathbb{R}^m \times \mathcal{Y}$ and consider a family $\{u^{\epsilon} = (u_1^{\epsilon}, u_2^{\epsilon}), \epsilon > 0\}$ of controls in $\mathcal{A}$ satisfying

$$\sup_{\epsilon > 0} \mathbb{E}\int_0^T \left[\|u_1^{\epsilon}(s)\|^2 + \|u_2^{\epsilon}(s)\|^2\right] ds < \infty.$$

Then the family $\{(\bar{X}^{\epsilon}, \mathrm{P}^{\epsilon,\Delta}), \epsilon > 0\}$ is tight. Given the particular regime of interaction $i = 1, 2, 3$
and given any subsequence of $\{(\bar{X}^{\epsilon}, \mathrm{P}^{\epsilon,\Delta}), \epsilon > 0\}$, there exists a subsubsequence that converges in
distribution with limit $(\bar{X}^{i}, \mathrm{P}^{i})$. With probability 1, the limit point $(\bar{X}^{i}, \mathrm{P}^{i}) \in \mathcal{V}_{(\lambda_i, \mathcal{L}^i)}$, according to
Definition 2.5, with the pairs $(\lambda_i, \mathcal{L}^i)$ given by Definitions 2.3 and 2.4.



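Proof. The proof is analogous to that of Theorem 2.8 in [14]. For completeness, let us sketch the
main features of the arguments. Tightness of $\{\bar{X}^{\epsilon}, \epsilon > 0\}$ follows from the easily obtained estimate
that for every $\eta > 0$

$$\lim_{\rho \downarrow 0} \limsup_{\epsilon \downarrow 0} \mathbb{P}_{x_0,y_0}\left[\sup_{|t_1 - t_2| < \rho,\ 0 \le t_1 < t_2 \le T} \left|\bar{X}^{\epsilon}_{t_1} - \bar{X}^{\epsilon}_{t_2}\right| > \eta\right] = 0.$$

The only regime for which this estimate needs some discussion is Regime 1, due to the unclear behavior
of the term $\frac{\epsilon}{\delta}\int b\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) ds$ as $\epsilon/\delta \uparrow \infty$. We treat this term by applying the Itô formula to
$\chi(x, y)$, the solution to the cell problem (2.1). Essentially, this allows us to replace the drift term

$$\int_0^t \left[\frac{\epsilon}{\delta}\, b\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) + c\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) + \sigma\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s)\right) u_1(s)\right] ds$$

by the term

$$\int_0^t \lambda_1\left(\bar{X}^{\epsilon}(s), \bar{Y}^{\epsilon}(s), u_1(s), u_2(s)\right) ds,$$

plus lower order terms that vanish as $\epsilon \downarrow 0$ due to Condition 2.1. The integral term in the last
display does not have an $\epsilon/\delta$ term and then tightness follows in the standard way. The details are
omitted.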

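Tightness of the occupation measures $\{\mathrm{P}^{\epsilon,\Delta}, \epsilon > 0\}$ follows from the bound

$$\sup_{\epsilon \in (0,1]} \mathbb{E}_{x_0,y_0}\left[g(\mathrm{P}^{\epsilon,\Delta})\right] < \infty$$

for the tightness function

$$g(r) = \int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}\times[0,T]} \left[\|z_1\|^2 + \|z_2\|^2\right] r(dz_1 dz_2 dy dt), \qquad r \in \mathcal{P}(\mathcal{Z} \times \mathcal{Z} \times \mathcal{Y} \times [0,T]),$$

see Theorem A.19 in [IB].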

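Having established tightness, (2.2) follows from the characterization of solutions to SDEs via
the martingale problem formulation and the averaging principle, see [BlIlQ^fT^. Lastly, (2.3) follows
from the following argument. Let $\mathcal{A}^{\epsilon}_{z_1,z_2,x}$ be the operator associated with the fast motion $\bar{Y}^{\epsilon}$ in
(3.4) with $z_1 = u_1$, $z_2 = u_2$ and $x = \bar{X}^{\epsilon}$ fixed:

$$\mathcal{A}^{\epsilon}_{z_1,z_2,x} F(y) = \left[\frac{\epsilon}{\delta^2} f(x,y) + \frac{1}{\delta}\left[g(x,y) + \tau_1(x,y) z_1 + \tau_2(x,y) z_2\right]\right] \cdot \nabla_y F(y) + \frac{\epsilon}{\delta^2}\frac{1}{2}\left(\tau_1\tau_1^{T} + \tau_2\tau_2^{T}\right)(x,y) : \nabla_y\nabla_y F(y) \tag{3.5}$$

for functions $F \in C^2(\mathcal{Y})$. Also notice that for $F \in C^2(\mathcal{Y})$

$$M^{\epsilon}_t = F(\bar{Y}^{\epsilon}(t)) - F(y_0) - \int_0^t \mathcal{A}^{\epsilon}_{u_1(s),u_2(s),\bar{X}^{\epsilon}(s)} F(\bar{Y}^{\epsilon}(s))\, ds$$

is an $\mathfrak{F}_t$-martingale. Next, let

$$
g(\epsilon) =
\begin{cases}
\delta^2/\epsilon & \text{Regime 1,} \\
\epsilon & \text{Regime 2,} \\
\delta & \text{Regime 3,}
\end{cases}
$$

and notice that $g(\epsilon)\mathcal{A}^{\epsilon}_{z_1,z_2,x}$ converges to $\mathcal{L}^i_{z_1,z_2,x}$ under Regime $i = 1, 3$ and to $\gamma \mathcal{L}^2_{z_1,z_2,x}$ under
Regime $i = 2$, as $\epsilon \downarrow 0$. This and the fact that $g(\epsilon) M^{\epsilon}_t$ converges to zero in probability as $\epsilon \downarrow 0$ allow us to
obtain that $\int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}\times[0,T]} \mathcal{L}^i_{z_1,z_2,\bar{X}^{\epsilon}_t} F(y)\, \mathrm{P}^{\epsilon,\Delta}(dz_1 dz_2 dy dt)$ converges to zero in probability. Then,
the statement follows. □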

Next we state the main large deviations result. The main difference from the case considered
in [14] is the identification of the correct viable pair with respect to which the large deviations
principle is expressed. The proper viable pair in each regime is indicated by Theorem 3.2.

Theorem 3.3. Let $\{(X^{\epsilon}, Y^{\epsilon}), \epsilon > 0\}$ be the unique strong solution to (1.1) and consider Regime
$i = 1, 2, 3$. Assume Condition 2.1 and under Regime 1 assume Condition 2.2. Define

$$S^i(\phi) = \inf_{(\phi, \mathrm{P}) \in \mathcal{V}_{(\lambda_i, \mathcal{L}^i)}}\left[\frac{1}{2}\int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}\times[0,T]} \left[\|z_1\|^2 + \|z_2\|^2\right] \mathrm{P}(dz_1 dz_2 dy dt)\right], \tag{3.6}$$

with the convention that the infimum over the empty set is $\infty$. The pairs $(\lambda_i, \mathcal{L}^i)$ are given in
Definitions 2.3 and 2.4. Then, we have

(i) The level sets of $S^i$ are compact. In particular, for each $s < \infty$, the set

$$\Phi^i_s = \left\{\phi \in C([0,T]; \mathbb{R}^m) : S^i(\phi) \le s\right\}$$

is a compact subset of $C([0,T]; \mathbb{R}^m)$.

(ii) For every bounded and continuous function $h$ mapping $C([0,T]; \mathbb{R}^m)$ into $\mathbb{R}$

$$\liminf_{\epsilon \downarrow 0} -\epsilon \ln \mathbb{E}_{x_0,y_0}\left[\exp\left\{-\frac{h(X^{\epsilon})}{\epsilon}\right\}\right] \ge \inf_{\phi \in C([0,T]; \mathbb{R}^m)}\left[S^i(\phi) + h(\phi)\right].$$

(iii) In the case of Regime 3 assume either that we are in dimension 1, i.e., $m = 1$, $d = 2$, or that
$g(x,y) = g(y)$ and $\tau_i(x,y) = \tau_i(y)$, $i = 1, 2$, for the general multidimensional case. Then
for every bounded and continuous function $h$ mapping $C([0,T]; \mathbb{R}^m)$ into $\mathbb{R}$

$$\limsup_{\epsilon \downarrow 0} -\epsilon \ln \mathbb{E}_{x_0,y_0}\left[\exp\left\{-\frac{h(X^{\epsilon})}{\epsilon}\right\}\right] \le \inf_{\phi \in C([0,T]; \mathbb{R}^m)}\left[S^i(\phi) + h(\phi)\right].$$

In other words, under the imposed assumptions, $\{X^{\epsilon}, \epsilon > 0\}$ satisfies the large deviations principle
with action functional $S^i$.

Proof. The proof is analogous to that of Theorem 2.10 in [T3]. For completeness, we sketch the
main features of the arguments.

Part (i) follows by noticing that the limit of any convergent sequence of viable pairs $\{(\phi^n, \mathrm{P}^n), n > 0\}$,
with uniformly (in $n$) square integrable $\mathrm{P}^n$, is a viable pair with respect to the same $(\lambda, \mathcal{L})$. Then, by
Fatou's Lemma applied to $S(\phi^n)$, the functional $S(\phi)$ is lower semicontinuous. Then compactness
of the level sets follows.

Part (ii) follows from the representation formula (3.3) and Theorem 3.2 using Fatou's Lemma.

Part (iii) is the most challenging part of the proof. In each regime we follow the same general
steps. What differs from regime to regime is the form of the viable pair $(\lambda_i, \mathcal{L}^i)$ in the definition
of the action functional $S^i(\cdot)$. To prove the Laplace principle upper bound we must show that for
all bounded, continuous functions $h$ mapping $C([0,T]; \mathbb{R}^m)$ into $\mathbb{R}$

$$\limsup_{\epsilon \downarrow 0} -\epsilon \ln \mathbb{E}_{x_0,y_0}\left[\exp\left\{-\frac{h(X^{\epsilon})}{\epsilon}\right\}\right] \le \inf_{\phi \in C([0,T]; \mathbb{R}^m)}\left[S^i(\phi) + h(\phi)\right].$$

By the variational representation formula (3.3), it is enough to prove that

$$\limsup_{\epsilon \downarrow 0} \inf_{u \in \mathcal{A}} \mathbb{E}_{x_0,y_0}\left[\frac{1}{2}\int_0^T \left[\|u_1(s)\|^2 + \|u_2(s)\|^2\right] ds + h(\bar{X}^{\epsilon})\right] \le \inf_{\phi \in C([0,T]; \mathbb{R}^m)}\left[S^i(\phi) + h(\phi)\right]. \tag{3.7}$$

In each regime, we consider for the limiting variational problem in the Laplace principle a nearly
optimal control pair $(\psi, \mathrm{P})$. In particular, let $\eta > 0$ be given and consider $\psi \in C([0,T]; \mathbb{R}^m)$ with
$\psi_0 = x_0$ such that

$$S^i(\psi) + h(\psi) \le \inf_{\phi \in C([0,T]; \mathbb{R}^m)}\left[S^i(\phi) + h(\phi)\right] + \eta < \infty.$$

Based on the constraints that $\psi$ and $\mathrm{P}$ satisfy through the corresponding viability property, we
construct, or prove the existence of, a control $u(s) = (u_1(s), u_2(s))$ for each regime for the prelimit
representation (left hand side of (3.7)) that leads to controlled processes and controls converging to
$\psi$ and $\mathrm{P}$ respectively. For the sake of presentation, the nearly optimal controls and their properties
are presented in Theorem 3.5 below. Once this control has been obtained, the proof of (3.7) follows.

□

The reader may wonder why we have imposed further structural restrictions for part (iii) of the
theorem for Regime 3. This is because we were not able to prove some smoothness requirements
of the constructed nearly optimal controls at the prelimit level with respect to $x$ in the general
multidimensional case when the coefficients $g$, $\tau_1$ and $\tau_2$ depend on $x$. Similar issues arise in the
case considered in [H] and are discussed in detail there. However, observing the viable pairs that
characterize the large deviations principle for Regimes 2 and 3, $(\lambda_2, \mathcal{L}^2)$ and $(\lambda_3, \mathcal{L}^3)$ respectively,
we notice that Regime 3 can be thought of as a limiting case of Regime 2 with $\gamma = 0$. So, one is led
to conjecture that the extra assumptions for Regime 3 are not necessary, even though we currently
do not have a proof for this.

While Theorem 3.3 provides a convenient representation for the action functional, it would be
desirable to have more explicit formulas. This is the content of Theorem 3.4. These representations
also lead to the construction of the nearly optimal controls that are necessary for the proof of part
(iii) of Theorem 3.3. This becomes clear in Theorem 3.5.

Let $\mathcal{AC}([0,T]; \mathbb{R}^m)$ be the set of absolutely continuous functions from $[0,T]$ to $\mathbb{R}^m$.



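Theorem 3.4. Consider the set-up of Theorem 3.3. Then, for $i = 1, 2, 3$ we have

$$
S^i(\phi) =
\begin{cases}
\int_0^T L_i(\phi(s), \dot{\phi}(s))\, ds & \text{if } \phi \in \mathcal{AC}([0,T]; \mathbb{R}^m) \text{ and } \phi(0) = x_0, \\
+\infty & \text{otherwise,}
\end{cases}
$$

where

$$L_i(x, \beta) = \inf_{(v, \mu) \in \mathcal{A}^i_{x,\beta}}\left\{\frac{1}{2}\int_{\mathcal{Y}} \|v(y)\|^2 \mu(dy)\right\} \tag{3.8}$$

with

$$
\begin{aligned}
\mathcal{A}^i_{x,\beta} = \Big\{ & v(\cdot) = (v_1(\cdot), v_2(\cdot)) : \mathcal{Y} \to \mathbb{R}^{2\kappa},\ \mu \in \mathcal{P}(\mathcal{Y}) : (v, \mu) \text{ satisfy } \int_{\mathcal{Y}} \mathcal{L}^i_{v_1(y),v_2(y),x} F(y)\, \mu(dy) = 0 \\
& \text{for all } F \in \mathcal{D}(\mathcal{L}^i_{z_1,z_2,x}),\ \int_{\mathcal{Y}} \|v(y)\|^2 \mu(dy) < \infty \text{ and } \beta = \int_{\mathcal{Y}} \lambda_i(x, y, v_1(y), v_2(y))\, \mu(dy) \Big\}.
\end{aligned}
$$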

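Proof. Notice that (3.6) can be written in terms of a local rate function

$$S^i(\phi) = \int_0^T L^r_i(\phi(s), \dot{\phi}(s))\, ds$$

(if $\phi$ is absolutely continuous). This follows from the definition of a viable pair by setting

$$L^r_i(x, \beta) = \inf_{\mathrm{P} \in \mathcal{A}^{i,r}_{x,\beta}} \int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}} \frac{1}{2}\left[\|z_1\|^2 + \|z_2\|^2\right] \mathrm{P}(dz_1 dz_2 dy), \tag{3.9}$$

where

$$
\begin{aligned}
\mathcal{A}^{i,r}_{x,\beta} = \Big\{ & \mathrm{P} \in \mathcal{P}(\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}) : \int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}} \mathcal{L}^i_{z_1,z_2,x} F(y)\, \mathrm{P}(dz_1 dz_2 dy) = 0 \text{ for all } F \in \mathcal{D}(\mathcal{L}^i_{z_1,z_2,x}), \\
& \int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}} \left[\|z_1\|^2 + \|z_2\|^2\right] \mathrm{P}(dz_1 dz_2 dy) < \infty \text{ and } \beta = \int_{\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y}} \lambda_i(x, y, z_1, z_2)\, \mathrm{P}(dz_1 dz_2 dy) \Big\}.
\end{aligned}
$$

Next we note that $\mathrm{P} \in \mathcal{P}(\mathcal{Z}\times\mathcal{Z}\times\mathcal{Y})$ can be decomposed into marginals as follows

$$\mathrm{P}(dz_1 dz_2 dy) = \eta(dz_1 dz_2 | y)\, \mu(dy).$$

This, the convexity of the cost in $(z_1, z_2)$ and the affine dependence of $\lambda_i$ on $(z_1, z_2)$ imply that
the relaxed control formulation (3.9) and the ordinary control formulation (3.8) are equivalent by
taking

$$v_i(y) = \int_{\mathcal{Z}\times\mathcal{Z}} z_i\, \eta(dz_1 dz_2 | y).$$

This concludes the proof of the theorem. □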

Now, we use the representations in Theorem 3.4 to obtain the controls needed in the proof of
part (iii) of Theorem 3.3. In the case of Regime 1 we can be even more specific and obtain a
closed form expression for the variational problem associated with the local rate function $L_1(x, \beta)$
appearing in Theorem 3.4. The derivation of the closed form expression is based on identifying an
optimal control that is then used to prove Theorem 3.3. The proof of this statement is based on a
straightforward Lagrange multiplier type of analysis of the variational problem (3.8) for $i = 1$ and
is thus omitted (see also Theorem 5.2 in [T3] for an analogous situation). In the case of Regime 2
and Regime 3, we can obtain that there is a pair $(\bar{v}, \bar{\mu})$ that attains the infimum in (3.8) (see also
Theorems 6.2, 7.1 and 7.2 in [H] for an analogous situation). We collect these statements in the
following theorem. The proof is omitted, since it follows along the lines of the corresponding proofs
of Theorems 5.2, 6.2, 7.1 and 7.2 of [H].

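Theorem 3.5. Assume Condition 2.1 and in the case of Regime 1 assume Condition 2.2. The
infimization problem (3.8) for $i = 1$ has the explicit solution

$$L_1(x, \beta) = \frac{1}{2}(\beta - r(x))^{T} q^{-1}(x) (\beta - r(x)),$$

where

• $r(x) = \int_{\mathcal{Y}} \left(c(x,y) + \frac{\partial \chi}{\partial y}(x,y)\, g(x,y)\right) \mu(dy|x)$,

• $q(x) = \int_{\mathcal{Y}} \left[\left(\sigma + \frac{\partial \chi}{\partial y}\tau_1\right)\left(\sigma + \frac{\partial \chi}{\partial y}\tau_1\right)^{T} + \left(\frac{\partial \chi}{\partial y}\tau_2\right)\left(\frac{\partial \chi}{\partial y}\tau_2\right)^{T}\right](x,y)\, \mu(dy|x)$,

and where $\mu(dy|x)$ is the unique invariant measure corresponding to the operator $\mathcal{L}^1_x$ and $\chi(x,y)$
is defined by (2.1). The control

$$\bar{u}_{\beta}(x,y) = \left(\bar{u}_{1,\beta}(x,y), \bar{u}_{2,\beta}(x,y)\right) = \left(\left(\sigma + \frac{\partial \chi}{\partial y}\tau_1\right)^{T}(x,y)\, q^{-1}(x)(\beta - r(x)),\ \left(\frac{\partial \chi}{\partial y}\tau_2\right)^{T}(x,y)\, q^{-1}(x)(\beta - r(x))\right)$$

attains the infimum in (3.8).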

In the case of Regime 2 and in the one dimensional setting of Regime 3, there is a pair $(\bar{u}, \bar{\mu})$
that achieves the infimum in (3.8) such that $\bar{u} = \bar{u}_{\beta}(x,y)$ is, for each fixed $\beta \in \mathbb{R}^m$, continuous in
$x$, Lipschitz continuous in $y$ and measurable in $(x, y, \beta)$. Moreover, $\bar{\mu}(dy) = \bar{\mu}_{\beta}(dy|x)$ is the unique
invariant measure corresponding to the operator $\mathcal{L}^i_{\bar{u}_{1,\beta}(x,y),\bar{u}_{2,\beta}(x,y),x}$, $i = 2, 3$, and it is weakly continuous as
a function of $x$. In the $x$-independent multidimensional case of Regime 3, there is a $\mathrm{P} \in \mathcal{A}^{3,r}_{x,\beta}$ that
achieves the infimum in (3.9) or equivalently in (3.8).
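The Regime 1 formula can be checked on a small discrete example. The sketch below assumes $m = \kappa = 1$ and replaces the invariant measure by hypothetical weights `mu[k]` on finitely many points $y_k$; the inputs `a[k]` and `bvals[k]` stand for the values of $\sigma + \frac{\partial \chi}{\partial y}\tau_1$ and $\frac{\partial \chi}{\partial y}\tau_2$ at $y_k$. None of these numbers come from the paper. Because the operator $\mathcal{L}^1_x$ does not depend on the control, the only constraint that involves the control in Regime 1 is the averaged drift constraint, which the sketch verifies together with the cost.

```python
def regime1_control_check(beta, r, a, bvals, mu):
    """Scalar, discretized sketch of the Regime 1 optimal control.

    a[k]     ~ (sigma + chi_y * tau1)(x, y_k)
    bvals[k] ~ (chi_y * tau2)(x, y_k)
    mu[k]    ~ invariant weight of y_k (nonnegative, summing to 1)
    Returns (q, cost, averaged_drift).
    """
    q = sum(m * (ak * ak + bk * bk) for m, ak, bk in zip(mu, a, bvals))
    xi = (beta - r) / q
    u1 = [ak * xi for ak in a]      # first component of the control
    u2 = [bk * xi for bk in bvals]  # second component of the control
    cost = 0.5 * sum(m * (p * p + s * s) for m, p, s in zip(mu, u1, u2))
    averaged_drift = r + sum(m * (ak * p + bk * s)
                             for m, ak, bk, p, s in zip(mu, a, bvals, u1, u2))
    return q, cost, averaged_drift
```

With these inputs, `cost` agrees with $\frac{1}{2}(\beta - r)^2/q$ and `averaged_drift` agrees with $\beta$, which is the scalar version of the statement that the control attains the infimum.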

Putting Theorems 3.4 and 3.5 together, we immediately get the following characterization for
Regime 1.

Theorem 3.6. Let $\{(X^{\epsilon}, Y^{\epsilon}), \epsilon > 0\}$ be the unique strong solution to (1.1) and consider Regime 1.
Under Conditions 2.1 and 2.2, $\{X^{\epsilon}, \epsilon > 0\}$ satisfies a large deviations principle with rate function

$$
S(\phi) =
\begin{cases}
\frac{1}{2}\int_0^T (\dot{\phi}(s) - r(\phi(s)))^{T} q^{-1}(\phi(s)) (\dot{\phi}(s) - r(\phi(s)))\, ds & \text{if } \phi \in \mathcal{AC}([0,T]; \mathbb{R}^m) \text{ and } \phi(0) = x_0, \\
+\infty & \text{otherwise.}
\end{cases}
$$



Notice that the coefficients $r(x)$ and $q(x)$ that enter into the action functional for Regime 1 are
those obtained if we had first taken $\delta \downarrow 0$ in (1.1) with $\epsilon$ fixed and then considered the large deviations
for the homogenized system. Indeed if $\epsilon = 1$, then $X^{\epsilon,\delta} = X^{1,\delta}$ can be shown to converge weakly
in the space of continuous functions $C([0,T]; \mathbb{R}^m)$, as $\delta \downarrow 0$, to the solution of an SDE with drift
coefficient $r(x)$ and diffusion coefficient $q^{1/2}(x)$. This can be derived via standard homogenization
theory [6, 35]. The action functional for a small noise diffusion with drift coefficient $r(x)$ and
diffusion coefficient $\sqrt{\epsilon}\, q^{1/2}(x)$ is the one given by Theorem 3.6. This is in accordance with intuition
since under Regime 1, $\delta$ goes to zero faster, so homogenization should occur first, as it indeed does.

Remark 3.7. Notice that if we set $f = b$, $g = c$, $\sigma = \tau_1$ and $\tau_2 = 0$ in the statements of the theorems above,
then one recovers the results of [H].



4. Importance Sampling 

The purpose of this section is to utilize the large deviations results of Section [3] in order to
obtain asymptotically efficient importance sampling schemes for quantities like (1.2). Simulation
problems involving rare events unavoidably have a number of mathematical and computational
challenges. As is well known, standard Monte Carlo sampling techniques perform very poorly
in that the relative errors under a fixed computational effort grow rapidly as the event becomes
more rare. Rare event estimation problems for systems of fast and slow motion present extra
difficulties due to the underlying fast motion and its interaction with the intensity of the noise $\epsilon$. In
particular, one needs to take into account the solution to the appropriate cell problem associated
with the homogenization theory of HJB equations in order to guarantee asymptotic optimality.
Related simulation results are provided in [15, 16] for the special case $f(x,y) = b(x,y) = -\nabla Q(y)$,
$g(x,y) = c(x,y) = -\nabla V(x)$, $\sigma(x,y) = \tau_1(x,y) = $ constant and $\tau_2(x,y) = 0$.

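We start by reviewing some general facts about importance sampling, adjusting the discussion to our
setting of interest. Consider a bounded continuous function $h : \mathbb{R}^m \to \mathbb{R}$ and suppose that one is
interested in estimating

$$\theta(\epsilon) = \mathbb{E}\left[e^{-\frac{1}{\epsilon} h(X^{\epsilon}(T))} \,\middle|\, X^{\epsilon}(t_0) = x_0,\ Y^{\epsilon}(t_0) = y_0\right]$$

by Monte Carlo, where the pair of slow and fast motion $(X^{\epsilon}, Y^{\epsilon})$ has initial point $X^{\epsilon}(t_0) = x_0$,
$Y^{\epsilon}(t_0) = y_0$. For Regime $i = 1, 2, 3$, let

$$G_i(t_0, x_0) = \inf_{\phi \in C([t_0,T]; \mathbb{R}^m),\ \phi(t_0) = x_0}\left[S^i_{t_0 T}(\phi) + h(\phi(T))\right], \tag{4.1}$$

where $S^i_{t_0 T}$ denotes the action functional of Section 3 on the time interval $[t_0, T]$.
As we shall see below, under regularity conditions, the function $G_i(t, x)$ satisfies a PDE of HJB
type. Now, depending on the regime of interaction, the contraction principle implies

$$\lim_{\epsilon \to 0} -\epsilon \log \theta(\epsilon) = G_i(t_0, x_0). \tag{4.2}$$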

Notice that the limit is independent of the initial point yo of the fast motion Y^. This is due to the 
averaging that takes place, as we shall also see later on in the rigorous proofs. 

Let r^(to, xo, yo) be any unbiased estimator of 6{e) that is defined on some probability space with 
probability measure P. With E denoting the expectation operator associated with P we have that 
T''{tQ,XQ,yQ) is a random variable such that 

Er{to,xo,yo) = 9{e). 

In Monte Carlo simulation, one generates a number of independent copies of $\Gamma^\epsilon(t_0,x_0,y_0)$ and the
estimate is the sample mean. The specific number of samples required depends on the desired accuracy,
which is measured by the variance of the sample mean. Because of unbiasedness, minimizing
the variance is equivalent to minimizing the second moment. Jensen's inequality implies

$$\bar{\mathbb{E}}\left( \Gamma^\epsilon(t_0,x_0,y_0) \right)^2 \geq \left( \bar{\mathbb{E}}\, \Gamma^\epsilon(t_0,x_0,y_0) \right)^2 = \theta(\epsilon)^2.$$

This and (4.2) say that

$$\limsup_{\epsilon \to 0} -\epsilon \log \bar{\mathbb{E}}\left( \Gamma^\epsilon(t_0,x_0,y_0) \right)^2 \leq 2 G_i(t_0,x_0).$$

Hence, $2G_i(t_0,x_0)$ is the best possible rate of decay of the second moment. If

$$\liminf_{\epsilon \to 0} -\epsilon \log \bar{\mathbb{E}}\left( \Gamma^\epsilon(t_0,x_0,y_0) \right)^2 \geq 2 G_i(t_0,x_0),$$

then $\Gamma^\epsilon(t_0,x_0,y_0)$ achieves this best decay rate, and is said to be asymptotically optimal.

It is important to note here that asymptotic optimality is not the only practical concern. Rare
events associated with multiscale problems are rather complicated, and it is often very difficult
to construct asymptotically optimal schemes. One way to circumvent this difficulty is to construct
appropriate sub-optimal schemes with precise bounds on asymptotic performance. This is the
content of Theorems 4.6, 4.8 and 4.10 for Regime $i = 1,2,3$ respectively.

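Fix the Regime $i = 1,2,3$ and assume that we are given a control $u(s,x,y;i)$ that is sufficiently
smooth and bounded. Let us recall the $2\kappa$-dimensional Wiener process $Z(\cdot) = (W(\cdot), B(\cdot))$.
Consider the family of probability measures $\bar{\mathbb{P}}^\epsilon$ defined by the change of measure

$$\frac{d\bar{\mathbb{P}}^\epsilon}{d\mathbb{P}} = \exp\left\{ -\frac{1}{2\epsilon} \int_{t_0}^{T} \left\| u(s,X^\epsilon(s),Y^\epsilon(s);i) \right\|^2 ds + \frac{1}{\sqrt{\epsilon}} \int_{t_0}^{T} \left\langle u(s,X^\epsilon(s),Y^\epsilon(s);i), dZ(s) \right\rangle \right\}.$$

By Girsanov's Theorem

$$\bar{Z}(s) = Z(s) - \frac{1}{\sqrt{\epsilon}} \int_{t_0}^{s} u(\rho,X^\epsilon(\rho),Y^\epsilon(\rho);i)\, d\rho, \qquad t_0 \leq s \leq T,$$

is a Wiener process on $[t_0,T]$ under the probability measure $\bar{\mathbb{P}}^\epsilon$, and $(X^\epsilon,Y^\epsilon)$ satisfies $X^\epsilon(t_0) = x_0$,
$Y^\epsilon(t_0) = y_0$ and for $s \in (t_0,T]$ it is the unique strong solution of (3.4) with $\bar{Z}(\cdot) = (\bar{W}(\cdot),\bar{B}(\cdot))$
in place of $Z(\cdot) = (W(\cdot),B(\cdot))$ and $u(s,x,y;i) = (u_1(s,x,y;i),u_2(s,x,y;i))$ in place of $u(s) = (u_1(s),u_2(s))$.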
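Letting

$$\Gamma^\epsilon(t_0,x_0,y_0) = \exp\left\{ -\frac{1}{\epsilon} h(X^\epsilon(T)) \right\} \frac{d\mathbb{P}}{d\bar{\mathbb{P}}^\epsilon}(X^\epsilon,Y^\epsilon),$$

it follows easily that under $\bar{\mathbb{P}}^\epsilon$, $\Gamma^\epsilon(t_0,x_0,y_0)$ is an unbiased estimator for $\theta(\epsilon)$. The performance of
this estimator is characterized by the decay rate of its second moment

$$Q^\epsilon(t_0,x_0,y_0;u) = \bar{\mathbb{E}}^\epsilon\left[ \exp\left\{ -\frac{2}{\epsilon} h(X^\epsilon(T)) \right\} \left( \frac{d\mathbb{P}}{d\bar{\mathbb{P}}^\epsilon}(X^\epsilon,Y^\epsilon) \right)^2 \right]. \tag{4.3}$$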
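To make the change of measure concrete, the following is a minimal sketch under simplifying assumptions that are not part of the paper: there is no fast component, the slow motion is $dX = \sqrt{\epsilon}\,dW$ with $X(0) = 0$ and $T = 1$, the target is $\mathbb{P}(X(1) \geq 1)$, and the control is a constant $u$. For this toy model the affine function $U(s,x) = (1 - x) - (1 - s)/2$ gives $u = -\partial_x U = 1$. The function names are hypothetical.

```python
import math
import random


def exact_probability(eps):
    """P(sqrt(eps) * W(1) >= 1) for a standard Brownian motion W."""
    return 0.5 * math.erfc(1.0 / math.sqrt(2.0 * eps))


def is_estimate(eps, u, n_samples, rng):
    """Importance sampling estimate of P(X(1) >= 1) for dX = sqrt(eps) dW, X(0) = 0.

    u is a constant control (u = 0 is standard Monte Carlo). Under the tilted
    measure, Wbar(s) = W(s) - u * s / sqrt(eps) is a Brownian motion, so that
    X(1) = sqrt(eps) * Wbar(1) + u.
    Returns (sample mean, estimated relative error of the sample mean).
    """
    root = math.sqrt(eps)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        w_bar = rng.gauss(0.0, 1.0)  # Wbar(1) sampled under the tilted measure
        x_end = root * w_bar + u
        if x_end >= 1.0:
            # Likelihood ratio dP/dPbar evaluated on the sampled path
            weight = math.exp(-u * u / (2.0 * eps) - u * w_bar / root)
        else:
            weight = 0.0
        total += weight
        total_sq += weight * weight
    mean = total / n_samples
    if mean == 0.0:
        return 0.0, float("inf")
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    return mean, math.sqrt(variance / n_samples) / mean


if __name__ == "__main__":
    rng = random.Random(0)
    eps = 0.05
    print("exact      :", exact_probability(eps))
    print("standard MC:", is_estimate(eps, 0.0, 20000, rng))
    print("tilted u=1 :", is_estimate(eps, 1.0, 20000, rng))
```

With the same number of samples, the untilted estimator typically records no hits at all for small $\epsilon$, while the tilted one keeps a small relative error. This is the behaviour that the decay rate of the second moment quantifies.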



We construct asymptotically efficient importance sampling schemes by choosing the control $u$ in
(4.3) such that the behavior of the second moment $Q^\epsilon(t_0,x_0,y_0;u)$ is controlled. There are two main
ingredients in the construction of $u$:

(i) The gradient of a subsolution to the PDE that the function $G_i(t,x)$ defined in (4.1) satisfies.
Under appropriate regularity conditions $G_i(t,x)$ satisfies a PDE of Hamilton-Jacobi-Bellman
(HJB) type.

(ii) The solution to the associated cell problem or, in other words, the so-called corrector from
the homogenization theory of HJB equations.

Depending on the regime of interaction, the HJB equation and the corresponding cell problem
take a different form. These will be made precise in Subsections 4.1-4.3.

As mentioned before, we work with appropriate subsolutions to the associated HJB equation.
Thus, let us now recall the notion of a subsolution to an HJB equation of the form

$$G_s(s,x) + H(x,\nabla_x G(s,x)) = 0, \qquad G(T,x) = h(x). \tag{4.4}$$

Definition 4.1. A function $U(s,x) : [0,T] \times \mathbb{R}^m \to \mathbb{R}$ is a classical subsolution to the HJB equation
(4.4) if

(i) $U$ is continuously differentiable,

(ii) $U_s(s,x) + H(x,\nabla_x U(s,x)) \geq 0$ for every $(s,x) \in (0,T) \times \mathbb{R}^m$,

(iii) $U(T,x) \leq h(x)$ for $x \in \mathbb{R}^m$.
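As an illustration of Definition 4.1, the sketch below checks conditions (ii) and (iii) on a finite grid for a one-dimensional Hamiltonian of the form $H(x,p) = r\,p - \frac{1}{2} q\,p^2$. The constants, the terminal cost and the affine candidate $U$ are hypothetical choices made only for this example, and a grid check is of course not a proof of the subsolution property.

```python
def is_subsolution_on_grid(U_s, U_x, U_terminal, H, h, s_grid, x_grid, tol=1e-12):
    """Check conditions (ii) and (iii) of the subsolution definition on a grid.

    U_s, U_x   : callables (s, x) -> time and space derivative of the candidate U
    U_terminal : callable x -> U(T, x)
    H          : callable (x, p) -> Hamiltonian
    h          : callable x -> terminal cost
    """
    interior_ok = all(
        U_s(s, x) + H(x, U_x(s, x)) >= -tol for s in s_grid for x in x_grid
    )
    terminal_ok = all(U_terminal(x) <= h(x) + tol for x in x_grid)
    return interior_ok and terminal_ok


# Hypothetical data: r = 0, q = 1, T = 1, terminal cost h(x) = (x - 1)^2 on the grid.
def H(x, p):
    return -0.5 * p * p


def h(x):
    return (x - 1.0) ** 2


# Affine candidate U(s, x) = 0.75 - x - 0.5 * (1 - s), tangent to h at x = 0.5.
good = dict(
    U_s=lambda s, x: 0.5,
    U_x=lambda s, x: -1.0,
    U_terminal=lambda x: 0.75 - x,
)
# Same time derivative but a steeper slope: violates condition (ii).
bad = dict(
    U_s=lambda s, x: 0.5,
    U_x=lambda s, x: -2.0,
    U_terminal=lambda x: 0.75 - x,
)

s_grid = [k / 10.0 for k in range(1, 10)]
x_grid = [-1.0 + k / 10.0 for k in range(41)]

if __name__ == "__main__":
    print(is_subsolution_on_grid(H=H, h=h, s_grid=s_grid, x_grid=x_grid, **good))
    print(is_subsolution_on_grid(H=H, h=h, s_grid=s_grid, x_grid=x_grid, **bad))
```

Passing the derivatives analytically, rather than by finite differences, avoids spurious failures when (ii) holds with equality, as it does for the first candidate.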

We will impose stronger regularity conditions on the subsolutions to be considered than those of
Definition 4.1. This is convenient for the purposes of illustration, since then the feedback control
is uniformly bounded and thus several technical problems are avoided. However, we mention that
the uniform bounds that will be assumed in Condition 4.2 can be replaced by milder conditions
at the expense of working harder to establish the results.

Condition 4.2. $U$ has continuous derivatives up to order 1 in $t$ and order 2 in $x$, and the first
and second derivatives in $x$ are uniformly bounded.

Roughly speaking, our main result is as follows. 

Theorem 4.3. Consider a bounded and continuous function $h : \mathbb{R}^m \to \mathbb{R}$ and assume Condition
2.1 and, under Regime 1, Condition 2.2. Let $\{(X^\epsilon(s),Y^\epsilon(s)), \epsilon > 0\}$ be the solution to (1.1)
for $s \in [t_0,T]$ with initial point $(x_0,y_0)$ at time $t_0$. Under Regime $i = 1,2,3$ let $u(s,x,y;i)$ be an
appropriately defined and smooth control in terms of a subsolution $U_i(s,x)$ to the HJB equation satisfied by
$G_i(s,x)$ and the corrector from the corresponding cell problem. Then

$$\liminf_{\epsilon \to 0} -\epsilon \ln Q^\epsilon(t_0,x_0,y_0;u(\cdot;i)) \geq G_i(t_0,x_0) + U_i(t_0,x_0). \tag{4.5}$$

Once we have established a theorem like Theorem 4.3, we can make a claim for estimating probabilities
of the form $\mathbb{P}_{t_0,x_0,y_0}[X^\epsilon(T) \in A]$ as well.

Proposition 4.4. Assume Conditions 2.1 and 4.2 and under Regime 1 assume Condition 2.2. Let
$\{(X^\epsilon,Y^\epsilon), \epsilon > 0\}$ be the solution to (1.1) with initial point $(t_0,x_0,y_0)$. Under Regime $i$, let $A \subset \mathbb{R}^m$
be a regular set with respect to the action functional $S^i$ and the initial point $(t_0,x_0,y_0)$, i.e., the
infimum of $S^i$ over the closure $\bar{A}$ is the same as the infimum over the interior $A^\circ$. Let

$$h(x) = \begin{cases} 0 & \text{if } x \in A, \\ +\infty & \text{if } x \notin A. \end{cases}$$

Let $u(s,x,y;i)$ be an appropriately defined and smooth control as in Theorem 4.3. Then (4.5) holds.

Proof. The claim of the proposition is not readily covered by Theorem 4.3 since the function $h$ is
neither bounded nor continuous. However, by an approximating argument analogous to [18] the
claim can be established. We omit the details. □

Notice that the lower asymptotic bound of Theorem 4.3 and Proposition 4.4 is independent of
the initial point $y_0$ of the fast component $Y^\epsilon$. This is due to averaging.

Remark 4.5. Since $U_i$ is a subsolution, we get that $U_i(s,x) \leq G_i(s,x)$ everywhere. By (4.5)
this implies that the scheme is asymptotically optimal if $U_i(t_0,x_0) = G_i(t_0,x_0)$ at the starting
point $(t_0,x_0)$. Standard Monte Carlo corresponds to choosing the subsolution $U_i = 0$. Hence, any
subsolution with value at the starting point $(t_0,x_0)$ such that

$$0 < U_i(t_0,x_0) \leq G_i(t_0,x_0)$$

will have better asymptotic performance than that of standard Monte Carlo.

In the next subsections we present how one can choose the controls $u(s,x,y;i)$ in terms of a
subsolution $U$ and its corresponding cell problem such that the bound mentioned in Theorem 4.3
is attained. The situation is subtle here due to the multiscale aspect of the problem.
4.1. Importance sampling for Regime 1. In this subsection we construct asymptotically efficient
importance sampling schemes for Regime 1. In Regime 1, the form of the Hamiltonian $H(x,p)$
in (4.4) is naturally suggested by the calculus of variations problem (4.1) and the explicit formula
of the rate function $S^1_{t_0 T}(\phi)$ in Theorem 3.6:

$$H(x,p) = \langle r(x), p \rangle - \frac{1}{2} \langle p, q(x) p \rangle. \tag{4.6}$$

In fact, under mild conditions $G_1$ from (4.1) is the unique viscosity solution to (4.4) with $H(x,p)$
defined by (4.6).

We have the following Theorem. 

Theorem 4.6. Let $\{(X^\epsilon(s),Y^\epsilon(s)), \epsilon > 0\}$ be the solution to (1.1) for $s \in [t_0,T]$ with initial
point $(x_0,y_0)$ at time $t_0$. Consider a bounded and continuous function $h : \mathbb{R}^m \to \mathbb{R}$ and assume
Conditions 2.1, 2.2 and 4.2. Let $U_1(s,x)$ be a subsolution to the associated HJB equation. Define
the feedback control $u(s,x,y;1) = (u_1(s,x,y;1), u_2(s,x,y;1))$ by

$$u(s,x,y;1) = \left( -\left( \sigma + \frac{\partial \chi}{\partial y} \tau_1 \right)^T (x,y)\, \nabla_x U_1(s,x),\ -\left( \frac{\partial \chi}{\partial y} \tau_2 \right)^T (x,y)\, \nabla_x U_1(s,x) \right).$$

Then the conclusion of Theorem 4.3 holds, i.e.,

$$\liminf_{\epsilon \to 0} -\epsilon \ln Q^\epsilon(t_0,x_0,y_0;u(\cdot;1)) \geq G_1(t_0,x_0) + U_1(t_0,x_0).$$

Before proceeding with the proof, we notice that the feedback control (4.6) is essentially implied
by the solution to the variational problem associated with the local rate function in the definition
of the action functional for Regime 1, Theorem 3.5.

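Proof. Note that under the given conditions $u(s,x,y;1)$ is Lipschitz continuous in $(x,y)$, continuous
in $(t,x,y)$, and uniformly bounded. For notational convenience, we omit the subscript 1 from $G_1$
and $U_1$ and we write $(t,x,y)$ in place of $(t_0,x_0,y_0)$.

Boundedness of $h$ and $u$ implies, by the representation formula (3.3) and by Lemma 4.3 of [15],
that

$$-\epsilon \log Q^\epsilon(t,x,y;u) = \inf_{v \in \mathcal{A}} \mathbb{E}\left[ \frac{1}{2} \int_t^T \|v(s)\|^2 ds - \int_t^T \left\| u(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s);1) \right\|^2 ds + 2h(\hat{X}^\epsilon(T)) \right], \tag{4.7}$$

where $v(s) = (v_1(s),v_2(s))$, $u(s,x,y;1) = (u_1(s,x,y;1),u_2(s,x,y;1))$ and $(\hat{X}^\epsilon,\hat{Y}^\epsilon)$ satisfies

$$\begin{aligned}
d\hat{X}^\epsilon(s) &= \left[ \frac{\epsilon}{\delta} b\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) + \bar{c}\left(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) + \sigma\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) v_1(s) \right] ds + \sqrt{\epsilon}\, \sigma\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) dW(s), \\
d\hat{Y}^\epsilon(s) &= \frac{1}{\delta} \left[ \frac{\epsilon}{\delta} f\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) + \bar{g}\left(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) + \tau_1\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) v_1(s) + \tau_2\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) v_2(s) \right] ds \\
&\quad + \frac{\sqrt{\epsilon}}{\delta} \left[ \tau_1\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) dW(s) + \tau_2\left(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)\right) dB(s) \right], \\
\hat{X}^\epsilon(0) &= x_0, \qquad \hat{Y}^\epsilon(0) = y_0,
\end{aligned} \tag{4.8}$$

with

$$\begin{aligned}
\bar{c}(s,x,y) &= c(x,y) - \sigma(x,y) u_1(s,x,y;1), \\
\bar{g}(s,x,y) &= g(x,y) - \tau_1(x,y) u_1(s,x,y;1) - \tau_2(x,y) u_2(s,x,y;1).
\end{aligned} \tag{4.9}$$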

The next step is to take the limit infimum in the representation (4.7). The right hand side of
(4.7) can be bounded from below in the limit $\epsilon \downarrow 0$ using statement (ii) of Theorem 3.3, with two
differences. The first difference is that the functions $c, g$ in the definition of the first component of
the appropriate viable pair (see Definition 2.4) are replaced with $\bar{c}, \bar{g}$. Using Theorem 3.5, the
local rate function takes the form

$$\bar{L}^1(x,\beta) = \frac{1}{2} \left( \beta - \bar{r}(s,x) \right)^T q^{-1}(x) \left( \beta - \bar{r}(s,x) \right)$$

where $\bar{r}(s,x) = r(x) - \int_{\mathcal{Y}} \left( \sigma u_1 + \frac{\partial \chi}{\partial y} \tau_1 u_1 + \frac{\partial \chi}{\partial y} \tau_2 u_2 \right)(s,x,y)\, \mu(dy|x)$. Here $\mu(dy|x)$ is the invariant
measure defined in Condition 2.2. The second difference is the presence of the integral term
$-\int_t^T \| u(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)) \|^2 ds$. Using classical averaging arguments, see [B], appropriately modified
to treat controlled processes, as in [H], this term can be replaced in the limit as $\epsilon \downarrow 0$ by its
averaged version with respect to $\mu(dy|x)$. By recalling the formula for $u = (u_1,u_2)$ we have for
$\phi \in \mathcal{AC}([t,T];\mathbb{R}^m)$

$$\int_t^T \int_{\mathcal{Y}} \left( \| u_1(s,\phi(s),y) \|^2 + \| u_2(s,\phi(s),y) \|^2 \right) \mu(dy|\phi(s))\, ds = \int_t^T \left\langle \nabla_x U(s,\phi(s)), q(\phi(s)) \nabla_x U(s,\phi(s)) \right\rangle ds.$$



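Thus, we have

$$\begin{aligned}
\liminf_{\epsilon \to 0} -\epsilon \log Q^\epsilon(t,x;u)
&\geq \inf_{\phi,\ \phi(t) = x} \Bigg[ \frac{1}{2} \int_t^T \left\langle \dot{\phi}(s) - \bar{r}(s,\phi(s)),\ q^{-1}(\phi(s)) \left( \dot{\phi}(s) - \bar{r}(s,\phi(s)) \right) \right\rangle ds \\
&\qquad\qquad - \int_t^T \int_{\mathcal{Y}} \left( \| u_1(s,\phi(s),y) \|^2 + \| u_2(s,\phi(s),y) \|^2 \right) \mu(dy|\phi(s))\, ds + 2h(\phi(T)) \Bigg] \\
&= \inf_{\phi,\ \phi(t) = x} \Bigg[ \frac{1}{2} \int_t^T \left\langle \dot{\phi}(s) - r(\phi(s)),\ q^{-1}(\phi(s)) \left( \dot{\phi}(s) - r(\phi(s)) \right) \right\rangle ds - \int_t^T \left\langle \dot{\phi}(s) - r(\phi(s)), \nabla_x U(s,\phi(s)) \right\rangle ds \\
&\qquad\qquad - \frac{1}{2} \int_t^T \left\langle \nabla_x U(s,\phi(s)), q(\phi(s)) \nabla_x U(s,\phi(s)) \right\rangle ds + 2h(\phi(T)) \Bigg] \\
&= \inf_{\phi,\ \phi(t) = x} \Bigg[ S^1_{tT}(\phi) + 2h(\phi(T)) \\
&\qquad\qquad - \int_t^T \left( \left\langle \dot{\phi}(s) - r(\phi(s)), \nabla_x U(s,\phi(s)) \right\rangle + \frac{1}{2} \left\langle \nabla_x U(s,\phi(s)), q(\phi(s)) \nabla_x U(s,\phi(s)) \right\rangle \right) ds \Bigg].
\end{aligned} \tag{4.10}$$

In the first equality we have used the definition of $u = (u_1,u_2)$, whereas in the second equality we
used the definition of the action functional by Theorem 3.6.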

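Given an arbitrary $\phi \in \mathcal{AC}([t,T];\mathbb{R}^m)$ with $\phi(t) = x$, the subsolution property implies that

$$\begin{aligned}
-\left\langle \dot{\phi}(s) - r(\phi(s)), \nabla_x U(s,\phi(s)) \right\rangle - \frac{1}{2} \left\langle \nabla_x U(s,\phi(s)), q(\phi(s)) \nabla_x U(s,\phi(s)) \right\rangle
&\geq -\partial_s U(s,\phi(s)) - \left\langle \nabla_x U(s,\phi(s)), \dot{\phi}(s) \right\rangle \\
&= -\frac{d}{ds} U(s,\phi(s)).
\end{aligned}$$

Let us now integrate both sides on $[t,T]$. Using the terminal condition $U(T,x) \leq h(x)$, we have

$$-\int_t^T \left[ \left\langle \dot{\phi}(s) - r(\phi(s)), \nabla_x U(s,\phi(s)) \right\rangle + \frac{1}{2} \left\langle \nabla_x U(s,\phi(s)), q(\phi(s)) \nabla_x U(s,\phi(s)) \right\rangle \right] ds \geq -h(\phi(T)) + U(t,x).$$

Thus, the right hand side of (4.10) is bounded from below by

$$\inf_{\phi,\ \phi(t) = x} \left[ S^1_{tT}(\phi) + h(\phi(T)) \right] + U(t,x).$$

Thus, since by definition $G(t,x) = \inf_{\phi \in \mathcal{AC}([t,T];\mathbb{R}^m),\ \phi(t) = x} \left[ S^1_{tT}(\phi) + h(\phi(T)) \right]$, we can conclude that

$$\liminf_{\epsilon \to 0} -\epsilon \log Q^\epsilon(t,x;u) \geq G(t,x) + U(t,x).$$

This concludes the proof. □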

4.2. Importance sampling for Regime 2. Let us now study the construction of efficient importance
sampling schemes for Regime 2. The situation here is more subtle than it is for Regime 1. This is
also seen from the large deviations principle, Theorem 3.4. The key difference between the LDP for
Regimes 1 and 2 is that $\mathcal{L}^2_{z,x}$ depends on $(z_1,z_2)$, while $\mathcal{L}^1_x$ did not. This means that relations
between the elements of a viable pair are more complex, and in particular that the joint distribution
of the control $(z_1,z_2)$ and fast variable $y$ is important. Thus, in contrast to Regime 1 where the
action functional can be written down explicitly, in the case of Regime 2 the formula of the action
functional is in terms of the value function of a variational problem.

This implicit characterization partially carries over to the importance sampling. The optimal
control is again in terms of a corresponding cell problem as it was for Regime 1 (recall the cell
problem (2.1) for Regime 1). The difference here is that the cell problem is defined implicitly
rather than explicitly. As we discuss in Section 5, this is related to the homogenization theory of
HJB equations.

In what follows, the subscript $\gamma$ is to emphasize the dependence on $\gamma$ (see (1.3)). Define



H.y{x,y,p,q,P,Q,R) = inf 

ui,U2GZ 



-(7(7 :P + 7-(tiT;^ + T2T2 ) : Q + ■^Tia : R + {'yb + c + aui,p) 



+ {if + 9 + riUi +T2U2,q) + ^ ||nif + ^ \\u2f 

= ^aa^ • ^ + (^1^1^ + ^2^2^) : Q + l^ia^^ -.R+ijb + c, p) 

(4.11) + (7/ + 9,q)-\ h^P + rhlf - ^ \\T:[qf 

The infimum in (4.11) is attained for

$$u_1 = -\sigma^T(x,y) p - \tau_1^T(x,y) q \quad \text{and} \quad u_2 = -\tau_2^T(x,y) q. \tag{4.12}$$
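The minimizer can be checked by completing the square. The following short computation is a sketch that uses only the terms of the infimum in (4.11) that depend on $(u_1,u_2)$, namely $\langle \sigma u_1, p\rangle + \langle \tau_1 u_1 + \tau_2 u_2, q\rangle + \frac{1}{2}\|u_1\|^2 + \frac{1}{2}\|u_2\|^2$; the second-order terms play no role here.

```latex
\begin{aligned}
&\langle \sigma u_1, p \rangle + \langle \tau_1 u_1 + \tau_2 u_2, q \rangle
   + \tfrac{1}{2}\|u_1\|^2 + \tfrac{1}{2}\|u_2\|^2 \\
&\quad = \tfrac{1}{2}\left\| u_1 + \sigma^T p + \tau_1^T q \right\|^2
       - \tfrac{1}{2}\left\| \sigma^T p + \tau_1^T q \right\|^2
       + \tfrac{1}{2}\left\| u_2 + \tau_2^T q \right\|^2
       - \tfrac{1}{2}\left\| \tau_2^T q \right\|^2 .
\end{aligned}
```

Both squared norms containing $u_1$ and $u_2$ vanish exactly at $u_1 = -\sigma^T p - \tau_1^T q$ and $u_2 = -\tau_2^T q$, which leaves the two negative quadratic terms $-\frac{1}{2}\|\sigma^T p + \tau_1^T q\|^2 - \frac{1}{2}\|\tau_2^T q\|^2$ in the Hamiltonian.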

The control $u = (u_1,u_2)$ motivates the asymptotically optimal change of measure in Theorem 4.8.
Let us now define the associated HJB equation of interest together with the associated cell problem.
We start with the cell problem. For each fixed $(x,p)$ consider the unique value $\bar{H}_\gamma(x,p)$ such that
there is a periodic solution $\xi_\gamma$ to the cell problem

$$H_\gamma\left( x,y,p,\nabla_y \xi_\gamma, 0, \nabla_y^2 \xi_\gamma, 0 \right) = \bar{H}_\gamma(x,p). \tag{4.13}$$

The unknown in (4.13) is the pair $(\xi_\gamma, \bar{H}_\gamma)$. As can be obtained from Theorem II.2 in [2], $\xi_\gamma$ is the
unique (up to an additive constant) periodic solution to (4.13) such that $\xi_\gamma \in C^2(\mathbb{R}^{d-m})$. Moreover,
$\bar{H}_\gamma(x,p)$ is continuous in $x$ and concave in $p$ (see Propositions 11 and 12 in [1]).

Consider then the HJB equation (4.4) with $H(x,p)$ replaced by $\bar{H}_\gamma(x,p)$. Under the standing
assumptions, this HJB equation has a unique viscosity solution which we denote by $G_2(s,x)$.
Actually, under mild conditions the value function of the variational problem (4.1) is this unique
viscosity solution. This can be derived as in [1] and it will be recalled in Section 5.

In accordance with what we did for Regime 1, we consider a classical subsolution to that HJB
equation, which we denote by $U_2(s,x)$, where the Hamiltonian is $\bar{H}_\gamma(x,p)$. Notice that $\xi_\gamma$ depends
on the triple $(x,y,p)$ with $(x,p)$ seen as parameters. In the computations $p$ will be substituted
by the gradient of the subsolution $\nabla_x U_2(s,x)$. So, in principle $\xi_\gamma$ and $U_2(s,x)$ are coupled. This
coupling is in line with the coupling that appears in large deviations, see Theorem 3.4.

Similarly to what we did for Regime 1, we impose stronger regularity conditions. This is done
to ease exposition. In particular, the following condition guarantees that the feedback control used in
importance sampling is uniformly bounded and that we can apply the Itô formula directly without
approximations. Thus, a number of technicalities are circumvented.

Condition 4.7. $U_2$ has continuous derivatives up to order 1 in $t$ and order 2 in $x$, and the first
and second derivatives in $x$ are uniformly bounded. Similarly, $\xi_\gamma$ is twice continuously differentiable
in $(x,y,p)$, periodic with respect to $y$, and all of the mixed derivatives up to order 2 are bounded.

The following verification theorem is the analogue of Theorem 4.6 for Regime 2.

Theorem 4.8. Let $\{(X^\epsilon(s),Y^\epsilon(s)), \epsilon > 0\}$ be the solution to (1.1) for $s \in [t_0,T]$ with initial point
$(x_0,y_0)$ at time $t_0$. Assume that we are considering Regime 2. Consider a bounded and continuous
function $h : \mathbb{R}^m \to \mathbb{R}$ and assume Condition 2.1. Let $\xi_\gamma(x,y,p)$ be the unique (up to a constant)
periodic solution to the cell problem (4.13) and $U_2(s,x)$ be a classical subsolution according to
Definition 4.1 and assume Condition 4.7. Define the control $u(s,x,y;2) = (u_1(s,x,y;2),u_2(s,x,y;2))$
by

$$u(s,x,y;2) = \left( -\sigma^T(x,y) \nabla_x U_2(s,x) - \tau_1^T(x,y) \nabla_y \xi_\gamma(x,y,\nabla_x U_2(s,x)),\ -\tau_2^T(x,y) \nabla_y \xi_\gamma(x,y,\nabla_x U_2(s,x)) \right).$$

Then the conclusion of Theorem 4.3 holds, i.e.,

$$\liminf_{\epsilon \to 0} -\epsilon \ln Q^\epsilon(t_0,x_0,y_0;u(\cdot;2)) \geq G_2(t_0,x_0) + U_2(t_0,x_0).$$

Proof. For notational convenience, we omit the subscripts 2 and $\gamma$ from $G_2$, $U_2$, $\xi_\gamma$ and $\bar{H}_\gamma$. Also we
write $\xi(s,x,y)$ in place of $\xi(x,y,\nabla_x U(s,x))$ and $(t,x,y)$ in place of $(t_0,x_0,y_0)$ for the initial point.
The first step is to write, as in Regime 1, that

$$-\epsilon \log Q^\epsilon(t,x,y;u(\cdot;2)) = \inf_{v \in \mathcal{A}} \mathbb{E}\left[ \frac{1}{2} \int_t^T \|v(s)\|^2 ds - \int_t^T \left\| u(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s);2) \right\|^2 ds + 2h(\hat{X}^\epsilon(T)) \right] \tag{4.14}$$

where $(\hat{X}^\epsilon,\hat{Y}^\epsilon)$ satisfies (4.8) with $v(s) = (v_1(s),v_2(s))$ and $u(s,x,y;2) = (u_1(s,x,y;2),u_2(s,x,y;2))$.
The next step is to rewrite the right hand side of (4.14). Recall the definition of the operator
$\mathcal{L}^2_{z,x}$ from Definition 2.3 with $z = (z_1,z_2)$. Denote by $\mathcal{L}^{2,\epsilon/\delta}_{z,x}$ the operator $\mathcal{L}^2_{z,x}$ with $\epsilon/\delta$ in place of $\gamma$.

We will write $\mathcal{L}^{2,\epsilon/\delta}_{0,x}$ to denote the operator with the control variable $z = 0$.

Apply Ito formula to After some term rearrangement, we get 

(4.15) 

-J ^o!S(.)^(^'^'(*)'^'(^))^^ = J {^ytn{vi-n,) + T2{v2-U2))[s,X^{s),Y%s))ds+R,ie 
where the random variable $R_1(\epsilon,v)$ is

Ri{e,v) = 6 ax,x/6,t)-^(T,X'{T),Y'{T))+ I dt^ { s, X^{s),Y^{s) ] ds + 



ds 



+ V^S / (V,e, adWis)) (s, X'{s),Y%s)) + \ {Vy(, ndW{s) + T2dB{s)) (s, X'is),Y'{s 



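Under our assumptions, the random variable $R_1(\epsilon,v)$ converges in $L^2$ to zero as $\epsilon,\delta \downarrow 0$, uniformly
in $v \in \mathcal{A}$.

Next, we apply the Itô formula to $U(t,x)$. Omitting some function arguments for notational convenience
and using the subsolution property for $U$, we get

$$\begin{aligned}
h(\hat{X}^\epsilon(T)) \geq{}& U(t,x) + \int_t^T \left[ -\bar{H}(\hat{X}^\epsilon(s),\nabla_x U) + \left\langle \nabla_x U, \frac{\epsilon}{\delta} b + c + \sigma (v_1(s) - u_1) \right\rangle \right] \left( s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) ds \\
&+ \frac{\epsilon}{2} \int_t^T \sigma\sigma^T \left( \hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) : \nabla_x \nabla_x U \left( s,\hat{X}^\epsilon(s) \right) ds \\
&+ \sqrt{\epsilon} \int_t^T \left\langle \nabla_x U \left( s,\hat{X}^\epsilon(s) \right), \sigma \left( \hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) dW(s) \right\rangle.
\end{aligned} \tag{4.16}$$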



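Recalling the definition of $\bar{H}$ by (4.13) and adding and subtracting the integral term that appears in (4.15),
relation (4.16) becomes, after using (4.15),

$$\begin{aligned}
h(\hat{X}^\epsilon(T)) - U(t,x) \geq{}& (\epsilon/\delta - \gamma) \int_t^T \left\langle \nabla_x U \left( s,\hat{X}^\epsilon(s) \right), b \left( \hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) \right\rangle ds \\
&+ \int_t^T \left\langle \nabla_x U \left( s,\hat{X}^\epsilon(s) \right), \sigma (v_1(s) - u_1) \left( s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) \right\rangle ds \\
&+ \int_t^T \left\langle \nabla_y \xi, \tau_1 (v_1(s) - u_1) + \tau_2 (v_2(s) - u_2) \right\rangle ds \\
&+ \frac{1}{2} \int_t^T \left\| u_1(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s);2) \right\|^2 ds + \frac{1}{2} \int_t^T \left\| u_2(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s);2) \right\|^2 ds \\
&+ R_1(\epsilon,v) + R_2(\epsilon,v)
\end{aligned} \tag{4.17}$$

where $R_1(\epsilon,v)$ was defined before and $R_2(\epsilon,v)$ is as follows

$$R_2(\epsilon,v) = \frac{\epsilon}{2} \int_t^T \sigma\sigma^T : \nabla_x \nabla_x U \left( s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) ds + \sqrt{\epsilon} \int_t^T \left\langle \nabla_x U \left( s,\hat{X}^\epsilon(s) \right), \sigma \left( \hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) dW(s) \right\rangle.$$

Under our assumptions, the random variable $R_2(\epsilon,v)$ converges in $L^2$ to zero as $\epsilon,\delta \downarrow 0$, uniformly
in $v \in \mathcal{A}$. Recalling the definitions of the controls $u_1, u_2$ we get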
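$$\begin{aligned}
h(\hat{X}^\epsilon(T)) - U(t,x) \geq{}& (\epsilon/\delta - \gamma) \int_t^T \left\langle \nabla_x U \left( s,\hat{X}^\epsilon(s) \right), b \left( \hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) \right\rangle ds \\
&- \int_t^T \left\langle u_1, v_1(s) - u_1 \right\rangle \left( s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) ds - \int_t^T \left\langle u_2, v_2(s) - u_2 \right\rangle \left( s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s) \right) ds \\
&+ \frac{1}{2} \int_t^T \left\| u_1(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s);2) \right\|^2 ds + \frac{1}{2} \int_t^T \left\| u_2(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s);2) \right\|^2 ds \\
&+ R_1(\epsilon,v) + R_2(\epsilon,v).
\end{aligned}$$

Writing for notational convenience $u_i(s) = u_i(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s);2)$ for $i = 1,2$, we get after some
term rearrangement

$$\begin{aligned}
-\int_t^T \left[ \| u_1(s) \|^2 + \| u_2(s) \|^2 \right] ds \geq{}& U(t,x) - h(\hat{X}^\epsilon(T)) + \frac{1}{2} \int_t^T \left[ \| u_1(s) \|^2 + \| u_2(s) \|^2 \right] ds \\
&- \int_t^T \left\langle u_1(s), v_1(s) \right\rangle ds - \int_t^T \left\langle u_2(s), v_2(s) \right\rangle ds + R(\epsilon,v)
\end{aligned} \tag{4.18}$$

where $R(\epsilon,v) = R_1(\epsilon,v) + R_2(\epsilon,v) + (\epsilon/\delta - \gamma) \int_t^T \left\langle \nabla_x U(s,\hat{X}^\epsilon(s)), b(\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)) \right\rangle ds$. Since
$\epsilon/\delta \to \gamma$ and $R_1(\epsilon,v), R_2(\epsilon,v)$ converge in $L^2$ to zero as $\epsilon,\delta \downarrow 0$, Condition 4.7 implies that $R(\epsilon,v)$
converges in $L^2$ to zero uniformly in $v \in \mathcal{A}$.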
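Inserting (4.18) into (4.14) gives us

$$-\epsilon \ln Q^\epsilon(t,x;u) \geq \inf_{v \in \mathcal{A}} \mathbb{E}\left[ \frac{1}{2} \int_t^T \left\| v(s) - u(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s)) \right\|^2 ds + h(\hat{X}^\epsilon(T)) + U(t,x) + R(\epsilon,v) \right].$$

Set $\bar{v}(s) = v(s) - u(s,\hat{X}^\epsilon(s),\hat{Y}^\epsilon(s))$. Since $\bar{v} \in \mathcal{A}$, the representation formula (3.3) implies that

$$\mathbb{E}\left[ \frac{1}{2} \int_t^T \| \bar{v}(s) \|^2 ds + h(\hat{X}^\epsilon(T)) \right] \geq -\epsilon \log \mathbb{E} \exp\left\{ -\frac{1}{\epsilon} h(X^\epsilon(T)) \right\}.$$

Recalling that $R(\epsilon,v)$ converges in $L^2$ to zero uniformly in $v \in \mathcal{A}$ as $\epsilon,\delta \downarrow 0$ and using statement
(ii) of Theorem 3.3 we get

$$\begin{aligned}
\liminf_{\epsilon \to 0} -\epsilon \log Q^\epsilon(t,x,y;u)
&\geq \liminf_{\epsilon \to 0} \inf_{v \in \mathcal{A}} \mathbb{E}\left[ \frac{1}{2} \int_t^T \| \bar{v}(s) \|^2 ds + h(\hat{X}^\epsilon(T)) + R(\epsilon,v) \right] + U(t,x) \\
&\geq \liminf_{\epsilon \to 0} -\epsilon \log \mathbb{E} \exp\left\{ -\frac{1}{\epsilon} h(X^\epsilon(T)) \right\} + U(t,x) \\
&= G(t,x) + U(t,x).
\end{aligned} \tag{4.19}$$

This concludes the proof of the theorem. □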



We conclude this subsection with the following remark. This remark relaxes the requirement of
a solution pair $(\xi_\gamma(x,y,p), \bar{H}_\gamma(x,p))$ to the cell problem (4.13) to a subsolution pair. This can be
useful in problems where solving the cell problem is difficult even numerically.

Remark 4.9. In the proof of the theorem, the definition of the cell problem (4.13) was only used
in (4.17). However, it is easy to see that the inequality in (4.17) would be true if, instead of the
solution pair to the cell problem, a subsolution pair was used, i.e., a pair $(\xi_\gamma(x,y,p), \bar{H}_\gamma(x,p))$ such
that $H_\gamma(x,y,p,\nabla_y \xi_\gamma, 0, \nabla_y^2 \xi_\gamma, 0) \geq \bar{H}_\gamma(x,p)$ for all $y \in \mathcal{Y}$ and $(x,p) \in \mathbb{R}^m \times \mathbb{R}^m$. So, one can
seek subsolution pairs $(\xi_\gamma(x,y,p), \bar{H}_\gamma(x,p))$ to (4.13) such that $\xi_\gamma(x,y,p)$ is periodic in $y$ and
$\bar{H}_\gamma(x,p)$ is concave in $p$.

4.3. Importance sampling for Regime 3. Finally, we study the construction of efficient importance
sampling schemes for Regime 3. The procedure here is similar to that of Regime 2. This is to be
expected, since Regime 3 is a limiting case of Regime 2 obtained by setting $\gamma = 0$. Therefore, we
shall only present the result, omitting the proof. The statement for the existence and regularity of
a pair $(\xi_0(x,y,p), \bar{H}_0(x,p))$ satisfying (4.13) with $\gamma = 0$ is given in Section 5.

Theorem 4.10. Let $\{(X^\epsilon(s),Y^\epsilon(s)), \epsilon > 0\}$ be the solution to (1.1) for $s \in [t_0,T]$ with initial
point $(x_0,y_0)$ at time $t_0$. Consider a bounded and continuous function $h : \mathbb{R}^m \to \mathbb{R}$ and assume
Condition 2.1. Let $(\xi_0(x,y,p), \bar{H}_0(x,p))$ be a pair satisfying the cell problem (4.13) with $\gamma = 0$
and $U_3(s,x)$ be a classical subsolution according to Definition 4.1 with Hamiltonian $\bar{H}_0(x,p)$ and
assume Condition 4.7 with $\gamma = 0$. Define the control $u(s,x,y;3) = (u_1(s,x,y;3),u_2(s,x,y;3))$ by

$$u(s,x,y;3) = \left( -\sigma^T(x,y) \nabla_x U_3(s,x) - \tau_1^T(x,y) \nabla_y \xi_0(x,y,\nabla_x U_3(s,x)),\ -\tau_2^T(x,y) \nabla_y \xi_0(x,y,\nabla_x U_3(s,x)) \right).$$

Then the conclusion of Theorem 4.3 holds, i.e.,

$$\liminf_{\epsilon \to 0} -\epsilon \ln Q^\epsilon(t_0,x_0,y_0;u(\cdot;3)) \geq G_3(t_0,x_0) + U_3(t_0,x_0).$$

Notice here that even though in the statement of the large deviations for Regime 3 (Theorem
3.3) we require that $g(x,y) = g(y)$ and $\tau_i(x,y) = \tau_i(y)$, in the statement of the related importance
sampling lower bound we do not require that assumption. The reason is that in the proof of the
importance sampling bound only the Laplace principle lower bound is used (compare with (4.19)),
and that holds with the $x$-dependence as well; see the second statement of Theorem 3.3.



5. Connection with homogenization of Hamilton-Jacobi-Bellman equations. 

It is evident from the calculations in Section 4 that there is an implied relation between importance
sampling for multiscale problems and homogenization of a related class of HJB equations. In this
section we aim to make this connection clear. We only outline the results that are relevant to the
importance sampling results. We refer the interested reader to the literature on homogenization for
Hamilton-Jacobi-Bellman equations for more detailed discussions, e.g. [1, 2, 9, 20, 25, 31].

Let us define the function

$$\Gamma^\epsilon(t,x,y) = \mathbb{E}_{t,x,y}\left[ e^{-\frac{1}{\epsilon} h(X^\epsilon(T))} \right]$$

where $(X^\epsilon,Y^\epsilon)$ is the strong solution to the uncontrolled process (1.1) with initial point $(X^\epsilon(t),Y^\epsilon(t)) = (x,y)$.
A straightforward computation shows that the function

$$G^\epsilon(t,x,y) = -\epsilon \ln \Gamma^\epsilon(t,x,y)$$

solves the Hamilton-Jacobi-Bellman equation

$$G^\epsilon(T,x,y) = h(x)$$

where the Hamiltonian $H^{\epsilon/\delta}$ is defined as in (4.11) with $\epsilon/\delta$ in place of $\gamma$. Under Conditions 2.1
and 2.2 we have the following.

• In the case of Regime 1, we have that $G^\epsilon(t,x,y)$ converges uniformly in compact subsets of
$[0,T] \times \mathbb{R}^m \times \mathbb{R}^{d-m}$ to the unique bounded and continuous viscosity solution of (4.4) with
effective Hamiltonian given by (4.6). We refer the reader to [9, 1] for details.

• In the case of Regime 2 the effective equation has again the form (4.4), but the effective
Hamiltonian is given by the unique constant $\bar{H}_\gamma$ such that the periodic cell problem (4.13)
has a unique (up to an additive constant) periodic solution $\xi_\gamma \in C^2(\mathbb{R}^{d-m})$ (see Theorem
II.2 in [2]). Under our assumptions, the effective Hamiltonian $\bar{H}_\gamma(x,p)$ is continuous in $x$
and concave in $p$ (see Propositions 11 and 12 in [1]).

• In the case of Regime 3 the effective equation has again the form (4.4), but the effective
Hamiltonian is given by the unique constant $\bar{H}_0$ such that the periodic cell problem (4.13)
with $\gamma = 0$ has a Lipschitz continuous periodic solution (see [H H]). Again, under our
assumptions, the effective Hamiltonian $\bar{H}_0(x,p)$ is continuous in $x$ and concave in $p$ (see
Proposition 3 in [1]).

We gather our observations in the following remark. 

Remark 5.1. In the context of importance sampling, we observe two things:

(i) the subsolutions that we are considering are subsolutions to the corresponding limiting HJB
equations, and

(ii) the cell problem arising in homogenization of HJB equations enters in the formulation of
the importance sampling scheme in each regime.

These imply that in Monte Carlo simulation for multiscale problems both the local information
described by the corresponding cell problem and the homogenized information described by
the solution to the HJB equation enter the asymptotically optimal change of measure. As is
demonstrated in the numerical simulations presented in [15], neglecting the local information and
basing the simulation only on the homogenized information can lead to estimators that perform
poorly in the small noise regime.



6. Examples

In this section we present some simple examples from the existing literature to illustrate what our
calculations look like. We consider two examples. The first one is the first order Langevin equation.
As we said in the introduction, this model can be used to model rough energy landscapes motivated
by applications in chemistry; see also [30, 33, Ml, Ell, 39]. This model was extensively discussed in
[15, 16] and the theory was also demonstrated by simulation results. We recall the formulas here
for completeness for this particularly important example. The second example is related to short
time asymptotics for processes that depend on another fast mean reverting process. Models of this
nature appear in mathematical finance in the context of fast mean reverting stochastic volatility
models, e.g., [22]. Assuming that we want to estimate

$$\theta(\epsilon) = \mathbb{E}\left[ e^{-\frac{1}{\epsilon} h(X^\epsilon(T))} \,\middle|\, X^\epsilon(0) = x_0,\ Y^\epsilon(0) = y_0 \right]$$

for a given function $h(x)$ and a given corresponding subsolution $U$, we also provide the control that
attains the desired bounds in Theorems 4.6, 4.8 and 4.10.

6.1. The first order Langevin equation. We consider the first order Langevin equation

$$dX^\epsilon(s) = \left[ -\frac{\epsilon}{\delta} \nabla Q\left( \frac{X^\epsilon(s)}{\delta} \right) - \nabla V\left( X^\epsilon(s) \right) \right] ds + \sqrt{\epsilon} \sqrt{2D}\, dW(s), \qquad X^\epsilon(0) = x_0. \tag{6.1}$$

To connect to the notation of the general model (1.1), this corresponds to

$$f(x,y) = b(x,y) = -\nabla Q(y), \quad g(x,y) = c(x,y) = -\nabla V(x), \quad \tau_1(x,y) = \sigma(x,y) = \sqrt{2D}, \quad \tau_2(x,y) = 0.$$
Let us consider the case of Regime 1. The invariant distribution associated to the operator $\mathcal{L}^1$
is the Gibbs distribution (independent of $x$)

$$\mu(dy) = \frac{1}{L} e^{-\frac{Q(y)}{D}} dy, \qquad L = \int_{\mathcal{Y}} e^{-\frac{Q(y)}{D}} dy.$$

Moreover, Condition 2.2 is trivially satisfied. In dimension 1, an easy computation shows that the
action functional takes the following explicit form

$$S_{0T}(\phi) = \begin{cases} \dfrac{1}{2} \displaystyle\int_0^T \frac{1}{q} \left| \dot{\phi}(s) - r(\phi(s)) \right|^2 ds & \text{if } \phi \in \mathcal{AC}([0,T];\mathbb{R}) \text{ and } \phi(0) = x_0, \\ +\infty & \text{otherwise,} \end{cases}$$

where

$$r(x) = -\frac{\lambda^2 V'(x)}{L \hat{L}}, \qquad q = \frac{2D\lambda^2}{L \hat{L}}$$

and

$$L = \int_{\mathcal{Y}} e^{-\frac{Q(y)}{D}} dy, \qquad \hat{L} = \int_{\mathcal{Y}} e^{\frac{Q(y)}{D}} dy.$$

In addition, we can also compute the optimal change of measure for the importance
sampling problem. Given a classical subsolution $U$, the importance sampling control that appears
in Theorem 4.6 takes the form

$$u(s,x,y;1) = \left( -\frac{\sqrt{2D}\,\lambda}{\hat{L}} e^{\frac{Q(y)}{D}} \partial_x U(s,x),\ 0 \right).$$

The choice of the subsolution $U$ according to Definition 4.1 depends on the terminal cost of interest
$h(x)$. See also [15, 16] for some particular examples with specific choices of subsolutions $U(s,x)$.
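The Regime 1 quantities for the one-dimensional Langevin example can be evaluated by simple quadrature. The sketch below is only illustrative: it assumes the formulas $q = 2D\lambda^2/(L\hat{L})$, $r(x) = -\lambda^2 V'(x)/(L\hat{L})$ and a control coefficient $\sqrt{2D}\,\lambda\, e^{Q(y)/D}/\hat{L}$, and the potential $Q(y) = \cos(2\pi y)$ with $D = 1$, $\lambda = 1$ is a hypothetical choice. The helper names are not from the paper.

```python
import math


def midpoint_integral(func, a, b, n=2000):
    """Composite midpoint rule on [a, b]; very accurate for smooth periodic integrands."""
    step = (b - a) / n
    return step * sum(func(a + (k + 0.5) * step) for k in range(n))


def langevin_regime1(Q, D, period=1.0, n=2000):
    """Return (L, L_hat, q, drift_factor, control_coefficient) for a periodic potential Q.

    L     = integral of exp(-Q/D) over one period (Gibbs normalization)
    L_hat = integral of exp(+Q/D) over one period
    q     = 2 D period^2 / (L L_hat)                 (effective diffusivity)
    r(x)  = -drift_factor * V'(x), with drift_factor = period^2 / (L L_hat)
    u_1(s, x, y; 1) = -control_coefficient(y) * dU/dx(s, x), and u_2 = 0
    """
    L = midpoint_integral(lambda y: math.exp(-Q(y) / D), 0.0, period, n)
    L_hat = midpoint_integral(lambda y: math.exp(Q(y) / D), 0.0, period, n)
    drift_factor = period ** 2 / (L * L_hat)
    q = 2.0 * D * drift_factor

    def control_coefficient(y):
        return math.sqrt(2.0 * D) * period / L_hat * math.exp(Q(y) / D)

    return L, L_hat, q, drift_factor, control_coefficient


def Q_example(y):
    # Hypothetical rough potential with period 1
    return math.cos(2.0 * math.pi * y)


if __name__ == "__main__":
    L, L_hat, q, drift_factor, coeff = langevin_regime1(Q_example, D=1.0)
    print("L =", L, " L_hat =", L_hat)
    print("q =", q, " (bare diffusivity 2D = 2.0)")
    print("control coefficient at y = 0:", coeff(0.0))
```

A useful consistency check is that averaging the squared control coefficient against the Gibbs measure returns $q$, which mirrors the identity used in the proof of Theorem 4.6. By the Cauchy-Schwarz inequality $L\hat{L} \geq \lambda^2$, so $q$ never exceeds the bare diffusivity $2D$.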

6.2. Short time asymptotics and fast mean reversion. Next we consider a particular system
of slow-fast motion, where the fast motion is a fast mean reverting process. The slow motion
appears due to the interest in short time asymptotics. In particular, let us consider the system in
1 + 1 dimension

$$\begin{aligned}
dX(s) &= b(Y(s))\, ds + \sigma(Y(s))\, dW(s), \\
dY(s) &= \frac{1}{\delta} (m - Y(s))\, ds + \frac{1}{\sqrt{\delta}} \left[ \rho\, dW(s) + \sqrt{1 - \rho^2}\, dB(s) \right],
\end{aligned} \tag{6.2}$$

where $0 < \delta \ll 1$ is the fast mean reversion parameter and $\rho \in [-1,1]$ is the correlation between the
noise of the $X$ and $Y$ processes. Assume that we are interested in short time asymptotics. Then it
is convenient to change time $s \mapsto \epsilon s$ with $0 < \epsilon \ll 1$. Writing the system under the new timescale,
we obtain $\{(X^\epsilon(s),Y^\epsilon(s)), s \in [0,T]\}$ as the unique strong solution to:

$$\begin{aligned}
dX^\epsilon(s) &= \epsilon\, b(Y^\epsilon(s))\, ds + \sqrt{\epsilon}\, \sigma(Y^\epsilon(s))\, dW(s), \\
dY^\epsilon(s) &= \frac{\epsilon}{\delta} (m - Y^\epsilon(s))\, ds + \sqrt{\frac{\epsilon}{\delta}} \left[ \rho\, dW(s) + \sqrt{1 - \rho^2}\, dB(s) \right].
\end{aligned} \tag{6.3}$$

Both components $(X,Y)$ take values in $\mathbb{R}$. We supplement the system with initial condition
$(X^\epsilon(0),Y^\epsilon(0)) = (x_0,y_0)$. To connect to the notation of the general model (1.1), this corresponds
to

$$b(x,y) = 0, \quad c^\epsilon(x,y) = \epsilon\, b(y), \quad \sigma(x,y) = \sigma(y),$$

$$f(x,y) = m - y, \quad g(x,y) = 0, \quad \tau_1(x,y) = \rho, \quad \tau_2(x,y) = \sqrt{1 - \rho^2}.$$
Of course, this system violates the periodicity assumption. However, due to the mean reverting
feature of the fast motion, the conclusions hold in this case as well. In the next subsections we see
the form of the large deviations action functional and of the control that defines the asymptotically
optimal change of measure for all three regimes.

6.2.1. The case of Regime 1. A simple computation shows that the only possible solution to the cell
problem (2.1) is the zero solution (this is because $b = 0$). Also, it is easy to see that the invariant
measure $\mu(dy)$ corresponding to the operator $\mathcal{L}^1$ is independent of $x$ and can be explicitly computed.
This implies that the formula for the action functional (Theorem 3.6) becomes

$$S_{0T}(\phi) = \begin{cases} \dfrac{1}{2} \displaystyle\int_0^T \frac{1}{q} \left| \dot{\phi}(s) \right|^2 ds & \text{if } \phi \in \mathcal{AC}([0,T];\mathbb{R}) \text{ and } \phi(0) = x_0, \\ +\infty & \text{otherwise,} \end{cases}$$

where $q = \int_{\mathcal{Y}} \sigma^2(y)\, \mu(dy)$. Given a classical subsolution $U$, the importance sampling control that
appears in Theorem 4.6 takes the form

$$u(s,x,y;1) = \left( -\sigma(y)\, \partial_x U(s,x),\ 0 \right).$$

As in the previous example, the choice of the subsolution $U$ according to Definition 4.1 depends on
the terminal cost of interest $h(x)$.

6.2.2. The case of Regime 2. The situation here is more complicated because the infimization
problem that appears in the definition of the local rate function, Theorem 3.3, does not necessarily
have a closed form solution as it had for Regime 1. However, because the problem is one-dimensional,
we can still carry out some algebraic computations. Some simple algebra shows that the
formula for the action functional (Theorem 3.3) becomes

r 

L2{4>sAs)ds if (?i G ^C([0,r];M) and (/>(0) = xq 



where 



n 

2 

+00 otherwise. 







L2(x,/3)= inf \\\ \v{y)\^n^{dy) 

^^^1,0 I 2 Jy 



with 



^^(dy) = yfn^^im-z)+2pviz)]dz^y^ f jy[2^irn-z)+2pviz)]dz^y 



y 



Alp = ^v{-):y^R,l3 = j^a{y)v{y)fi,.idy) 
The equation for the related cell problem (j4.13p takes the form 



There is a unique pair $(\xi_\gamma(y), H_\gamma(p))$ satisfying this equation, with $\xi_\gamma(y)$ locally regular in the sense of [2, 26]. 
Notice that for this model, the solution $(\xi_\gamma(y), H_\gamma(p))$ to the cell problem is independent of the 
slow motion $x$. Obtaining closed-form solutions to such equations is difficult in general, especially 
because we are interested in pairs $(\xi_\gamma(y), H_\gamma(p))$. Numerical methods such as the ones developed 
in [10, 24] will be useful here. Notice also that by Remark 4.9 appropriate subsolution pairs suffice. 
We plan to return to these issues in detail in a future work. 
Given sufficient smoothness such that Theorem 4.8 is applicable and a classical subsolution $\bar{U}$ 
(depending on the choice of the terminal cost $h(x)$), the importance sampling control that appears 
in Theorem 4.8 takes the form 

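\[
u(s,x,y;2) = \left( -\sigma(y)\, \partial_x \bar{U}(s,x) - \rho\, \partial_y \xi_\gamma\big(y, \partial_x \bar{U}(s,x)\big),\; -\tau_2\, \partial_y \xi_\gamma\big(y, \partial_x \bar{U}(s,x)\big) \right).
\]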

6.2.3. The case of Regime 3. It turns out that we can make some explicit computations here. 
For brevity we will assume that $\rho = 1$. With this assumption we get that 
$\tau_1(x,y) = 1$ and $\tau_2(x,y) = 0$. Assume that $\sigma \in L^1(\mathcal{Y})$ and that $\int_{\mathcal{Y}} \sigma(y)\,dy \neq 0$. A straightforward 
computation shows that the local rate function takes the form 



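\[
L_3(x,\beta) = \inf_{v} \left\{ \frac{1}{2}\, \frac{\int_{\mathcal{Y}} v(y)\, dy}{\int_{\mathcal{Y}} \frac{1}{v(y)}\, dy} \;:\; \beta = \frac{\int_{\mathcal{Y}} \sigma(y)\, dy}{\int_{\mathcal{Y}} \frac{1}{v(y)}\, dy} \right\}.
\]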



This problem can be solved explicitly yielding 



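\[
S_{0T}(\phi) =
\begin{cases}
\dfrac{1}{2} \displaystyle\int_0^T \left| \frac{\dot{\phi}_s}{\int_{\mathcal{Y}} \sigma(y)\, dy} \right|^2 ds & \text{if } \phi \in \mathcal{AC}([0,T];\mathbb{R}) \text{ and } \phi(0) = x_0, \\
+\infty & \text{otherwise.}
\end{cases}
\]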

The equation for the related cell problem (4.13) with $\gamma = 0$ takes the particularly simple form 

This has the form of a first-order Bellman equation with a quadratic Hamiltonian. Such equations 
have been studied in the literature and our assumptions guarantee that there are pairs $(\xi_0, H_0)$ 
such that $\xi_0$ is a continuous viscosity solution, provided the constant in the pair satisfies a threshold condition relative to a critical value. We refer 
the interested reader to [27] for an extensive discussion on this. 

If $\sigma(y)$ is periodic in $y$, say with period $\lambda = 1$, then $\mathcal{Y} = \mathbb{T} = [0,1]$ and we look for a periodic 
solution $\xi_0(y)$. It turns out that the Bellman equation can then be solved explicitly yielding 



\[
\xi_0(y,p) = p \left( y \int_0^1 \sigma(w)\, dw - \int_0^y \sigma(w)\, dw \right),
\]



Mp) 



Thus, indeed $(\xi_0(y,p), H_0(p))$ satisfy the assumptions of Theorem 4.10. Given a classical 
subsolution $\bar{U}$ (depending on the choice of the terminal cost $h(x)$), the importance sampling control 
that appears in Theorem 4.10 takes the particularly simple form 



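\[
u(s,x,y;3) = \left( -\sigma(y)\, \partial_x \bar{U}(s,x) - \partial_y \xi_0\big(y, \partial_x \bar{U}(s,x)\big),\; 0 \right)
= \left( -\left( \int_0^1 \sigma(w)\, dw \right) \partial_x \bar{U}(s,x),\; 0 \right).
\]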
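
The simplification of the Regime 3 control can be checked numerically. The sketch below assumes $\xi_0(y,p) = p\big(y\int_0^1 \sigma(w)\,dw - \int_0^y \sigma(w)\,dw\big)$ and uses a hypothetical 1-periodic coefficient $\sigma(y) = 2 + \sin(2\pi y)$; the helper name `regime3_control_on_grid` is likewise an illustrative assumption. The scalar `p` stands in for $\partial_x \bar{U}(s,x)$. The code builds $\xi_0$ on a grid and evaluates $-\sigma(y)p - \partial_y \xi_0(y,p)$, which should be constant in $y$ and equal to $-p\int_0^1 \sigma(w)\,dw$.

```python
import numpy as np


def sigma(y):
    # Hypothetical 1-periodic diffusion coefficient (illustration only).
    return 2.0 + np.sin(2.0 * np.pi * y)


def regime3_control_on_grid(p, n=2001):
    """Return the grid y, xi_0(y, p), the first control component
    -sigma(y) * p - d/dy xi_0(y, p), and int_0^1 sigma(w) dw."""
    y = np.linspace(0.0, 1.0, n)
    s = sigma(y)
    dy = y[1] - y[0]
    # Cumulative trapezoid rule: cum[i] approximates int_0^{y_i} sigma(w) dw.
    cum = np.concatenate(([0.0], np.cumsum(0.5 * (s[1:] + s[:-1]) * dy)))
    mean_sigma = cum[-1]
    xi = p * (y * mean_sigma - cum)
    # Second-order finite differences, including at the two end points.
    dxi = np.gradient(xi, y, edge_order=2)
    control = -s * p - dxi
    return y, xi, control, mean_sigma


if __name__ == "__main__":
    p = 1.5
    y, xi, control, mean_sigma = regime3_control_on_grid(p)
    # Largest deviation of the control from the constant -p * int_0^1 sigma.
    print(np.max(np.abs(control + p * mean_sigma)))
    # Periodicity of xi_0: the values at y = 0 and y = 1 should agree.
    print(xi[0], xi[-1])
```

The $y$-dependence of $\sigma(y)$ cancels against $\partial_y \xi_0$, so the change of measure in this regime depends on the fast variable only through the average of $\sigma$ over one period.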



7. Conclusions 

In this paper we have developed the large deviations theory and a rigorous mathematical framework 
for the importance sampling theory for systems of slow-fast motion like (1.1). All the possible 
cases of interaction of fast motion and intensity of the noise are considered. The asymptotic performance 
of the proposed schemes is characterized in terms of appropriate subsolutions to related HJB equations 
and in terms of appropriate "cell problems". Straightforward adaptation of importance sampling 
schemes from standard diffusions without multiscale features leads to poor results in the multiscale 
setting. We have shown how the problem can be dealt with in the general multidimensional setting 
for fully dependent systems of slow-fast motion, when the fast motion is periodic. 